Copy files in parallel by reading what files to copy from few other files

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
7
down vote

favorite
1












I am working on a little complicated shell script for the first time and this is what it should do:



  • During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

  • Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

  • Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

  • Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

My script does the job but I would like to see if there is any better or efficient way to do these things since this is my first time writing a little complicated script so I wanted to check this out. As of now, I don't have a mechanism to exit out of the shell script if I cannot find primary or secondary files for that clientid. I also don't have a mechanism to exit out of the shell script if during the verification phase some files are missing.



#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2

date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () grep "$prop_value"
# if I can't find clientid for my hostname, then I will log a message
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1;

# now once I have valid clientid, then I will get primary and secondary mapping
# from the "host-mapping.txt" file
declare -a arr
mappingsByClientID ()
id=$1 # 1 to 5
file=$path/$mapfiles[$2] # 0 to 1
arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
echo "$arr[@]"


# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))

echo "primary files: $pri[@]"
echo "secondary files: $snd[@]"

# figure out which machine you want to use to start copying files from
case $(hostname -f) in
*abc.host.com)
local_server=("$machines[0]")
;;
*def.host.com)
local_server=("$machines[1]")
;;
*pqr.host.com)
local_server=("$machines[2]")
;;
*) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
# ?
esac
export local="$local_server"

# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv ;
find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy $primary ::: $pri[@] &
parallel -j "$3" do_copy $secondary ::: $snd[@] &
wait
echo "all files copied"

# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
for num in "$array[@]"; do
name="hello_monthly_$num_999_1.data"
if [ ! -f "$1/$name" ]; then
echo "$name" not found in "$1" >&2 && exit 1;
fi
done
shift
done

date2=$(date +"%s")
diff=$(($date2-$date1))
echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."


Here is my host-mapping.txt file that will have lot more entries. Here, value is a valid hostname and key will be a string "k" followed by some number and that number should be there in the mapping files.



k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com


And here is my sample mapping files:



primary_mappings.txt



1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]


secondary_mappings.txt



1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]


For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.







share|improve this question



























    up vote
    7
    down vote

    favorite
    1












    I am working on a little complicated shell script for the first time and this is what it should do:



    • During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

    • Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

    • Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

    • Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

    My script does the job but I would like to see if there is any better or efficient way to do these things since this is my first time writing a little complicated script so I wanted to check this out. As of now, I don't have a mechanism to exit out of the shell script if I cannot find primary or secondary files for that clientid. I also don't have a mechanism to exit out of the shell script if during the verification phase some files are missing.



    #!/bin/bash
    path=/home/goldy/scripts
    mapfiles=(primary-mappings.txt secondary-mappings.txt)
    hostfile=host-mapping.txt
    machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
    # folders on local box where to copy files
    primary=/data01/primary
    secondary=/data02/secondary
    # folders on remote servers from where to copy files
    export hold1=/data/snapshot/$1
    export hold2=/data/snapshot/$2

    date1=$(date +"%s")
    # this will tell me what's my clientid given my current hostname
    getProperty () grep "$prop_value"
    # if I can't find clientid for my hostname, then I will log a message
    # and exit out of shell script with non zero status code
    clientid=$(getProperty)
    [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1;

    # now once I have valid clientid, then I will get primary and secondary mapping
    # from the "host-mapping.txt" file
    declare -a arr
    mappingsByClientID ()
    id=$1 # 1 to 5
    file=$path/$mapfiles[$2] # 0 to 1
    arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
    echo "$arr[@]"


    # assign output of function to an array
    pri=($(mappingsByClientID $clientid 0))
    snd=($(mappingsByClientID $clientid 1))

    echo "primary files: $pri[@]"
    echo "secondary files: $snd[@]"

    # figure out which machine you want to use to start copying files from
    case $(hostname -f) in
    *abc.host.com)
    local_server=("$machines[0]")
    ;;
    *def.host.com)
    local_server=("$machines[1]")
    ;;
    *pqr.host.com)
    local_server=("$machines[2]")
    ;;
    *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
    # ?
    esac
    export local="$local_server"

    # deleting files before we start copying
    find "$primary" -maxdepth 1 -type f -exec rm -fv ;
    find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

    do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

    export -f do_copy
    # copy files in parallel
    parallel -j "$3" do_copy $primary ::: $pri[@] &
    parallel -j "$3" do_copy $secondary ::: $snd[@] &
    wait
    echo "all files copied"

    # this is for verification to see all files got copied or not
    # in primary and secondary folder
    set -- "$primary" "$secondary"
    typeset -n array
    for array in pri snd; do
    for num in "$array[@]"; do
    name="hello_monthly_$num_999_1.data"
    if [ ! -f "$1/$name" ]; then
    echo "$name" not found in "$1" >&2 && exit 1;
    fi
    done
    shift
    done

    date2=$(date +"%s")
    diff=$(($date2-$date1))
    echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."


    Here is my host-mapping.txt file that will have lot more entries. Here, value is a valid hostname and key will be a string "k" followed by some number and that number should be there in the mapping files.



    k1=machineA.abc.com
    k2=machineB.abc.com
    k3=machineC.def.com
    k4=machineD.pqr.com
    k5=machineO.abc.com


    And here is my sample mapping files:



    primary_mappings.txt



    1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]


    secondary_mappings.txt



    1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]


    For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.







    share|improve this question























      up vote
      7
      down vote

      favorite
      1









      up vote
      7
      down vote

      favorite
      1






      1





      I am working on a little complicated shell script for the first time and this is what it should do:



      • During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

      • Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

      • Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

      • Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

      My script does the job but I would like to see if there is any better or efficient way to do these things since this is my first time writing a little complicated script so I wanted to check this out. As of now, I don't have a mechanism to exit out of the shell script if I cannot find primary or secondary files for that clientid. I also don't have a mechanism to exit out of the shell script if during the verification phase some files are missing.



      #!/bin/bash
      path=/home/goldy/scripts
      mapfiles=(primary-mappings.txt secondary-mappings.txt)
      hostfile=host-mapping.txt
      machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
      # folders on local box where to copy files
      primary=/data01/primary
      secondary=/data02/secondary
      # folders on remote servers from where to copy files
      export hold1=/data/snapshot/$1
      export hold2=/data/snapshot/$2

      date1=$(date +"%s")
      # this will tell me what's my clientid given my current hostname
      getProperty () grep "$prop_value"
      # if I can't find clientid for my hostname, then I will log a message
      # and exit out of shell script with non zero status code
      clientid=$(getProperty)
      [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1;

      # now once I have valid clientid, then I will get primary and secondary mapping
      # from the "host-mapping.txt" file
      declare -a arr
      mappingsByClientID ()
      id=$1 # 1 to 5
      file=$path/$mapfiles[$2] # 0 to 1
      arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
      echo "$arr[@]"


      # assign output of function to an array
      pri=($(mappingsByClientID $clientid 0))
      snd=($(mappingsByClientID $clientid 1))

      echo "primary files: $pri[@]"
      echo "secondary files: $snd[@]"

      # figure out which machine you want to use to start copying files from
      case $(hostname -f) in
      *abc.host.com)
      local_server=("$machines[0]")
      ;;
      *def.host.com)
      local_server=("$machines[1]")
      ;;
      *pqr.host.com)
      local_server=("$machines[2]")
      ;;
      *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
      # ?
      esac
      export local="$local_server"

      # deleting files before we start copying
      find "$primary" -maxdepth 1 -type f -exec rm -fv ;
      find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

      do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

      export -f do_copy
      # copy files in parallel
      parallel -j "$3" do_copy $primary ::: $pri[@] &
      parallel -j "$3" do_copy $secondary ::: $snd[@] &
      wait
      echo "all files copied"

      # this is for verification to see all files got copied or not
      # in primary and secondary folder
      set -- "$primary" "$secondary"
      typeset -n array
      for array in pri snd; do
      for num in "$array[@]"; do
      name="hello_monthly_$num_999_1.data"
      if [ ! -f "$1/$name" ]; then
      echo "$name" not found in "$1" >&2 && exit 1;
      fi
      done
      shift
      done

      date2=$(date +"%s")
      diff=$(($date2-$date1))
      echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."


      Here is my host-mapping.txt file that will have lot more entries. Here, value is a valid hostname and key will be a string "k" followed by some number and that number should be there in the mapping files.



      k1=machineA.abc.com
      k2=machineB.abc.com
      k3=machineC.def.com
      k4=machineD.pqr.com
      k5=machineO.abc.com


      And here is my sample mapping files:



      primary_mappings.txt



      1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]


      secondary_mappings.txt



      1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]


      For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.







      share|improve this question













      I am working on a little complicated shell script for the first time and this is what it should do:



      • During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

      • Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

      • Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

      • Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

      My script does the job but I would like to see if there is any better or efficient way to do these things since this is my first time writing a little complicated script so I wanted to check this out. As of now, I don't have a mechanism to exit out of the shell script if I cannot find primary or secondary files for that clientid. I also don't have a mechanism to exit out of the shell script if during the verification phase some files are missing.



      #!/bin/bash
      path=/home/goldy/scripts
      mapfiles=(primary-mappings.txt secondary-mappings.txt)
      hostfile=host-mapping.txt
      machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
      # folders on local box where to copy files
      primary=/data01/primary
      secondary=/data02/secondary
      # folders on remote servers from where to copy files
      export hold1=/data/snapshot/$1
      export hold2=/data/snapshot/$2

      date1=$(date +"%s")
      # this will tell me what's my clientid given my current hostname
      getProperty () grep "$prop_value"
      # if I can't find clientid for my hostname, then I will log a message
      # and exit out of shell script with non zero status code
      clientid=$(getProperty)
      [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1;

      # now once I have valid clientid, then I will get primary and secondary mapping
      # from the "host-mapping.txt" file
      declare -a arr
      mappingsByClientID ()
      id=$1 # 1 to 5
      file=$path/$mapfiles[$2] # 0 to 1
      arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
      echo "$arr[@]"


      # assign output of function to an array
      pri=($(mappingsByClientID $clientid 0))
      snd=($(mappingsByClientID $clientid 1))

      echo "primary files: $pri[@]"
      echo "secondary files: $snd[@]"

      # figure out which machine you want to use to start copying files from
      case $(hostname -f) in
      *abc.host.com)
      local_server=("$machines[0]")
      ;;
      *def.host.com)
      local_server=("$machines[1]")
      ;;
      *pqr.host.com)
      local_server=("$machines[2]")
      ;;
      *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
      # ?
      esac
      export local="$local_server"

      # deleting files before we start copying
      find "$primary" -maxdepth 1 -type f -exec rm -fv ;
      find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

      do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

      export -f do_copy
      # copy files in parallel
      parallel -j "$3" do_copy $primary ::: $pri[@] &
      parallel -j "$3" do_copy $secondary ::: $snd[@] &
      wait
      echo "all files copied"

      # this is for verification to see all files got copied or not
      # in primary and secondary folder
      set -- "$primary" "$secondary"
      typeset -n array
      for array in pri snd; do
      for num in "$array[@]"; do
      name="hello_monthly_$num_999_1.data"
      if [ ! -f "$1/$name" ]; then
      echo "$name" not found in "$1" >&2 && exit 1;
      fi
      done
      shift
      done

      date2=$(date +"%s")
      diff=$(($date2-$date1))
      echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."


      Here is my host-mapping.txt file that will have lot more entries. Here, value is a valid hostname and key will be a string "k" followed by some number and that number should be there in the mapping files.



      k1=machineA.abc.com
      k2=machineB.abc.com
      k3=machineC.def.com
      k4=machineD.pqr.com
      k5=machineO.abc.com


      And here is my sample mapping files:



      primary_mappings.txt



      1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]


      secondary_mappings.txt



      1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]


      For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.









      share|improve this question












      share|improve this question




      share|improve this question








      edited Apr 18 at 3:50









      Jamal♦

      30.1k11114225




      30.1k11114225









      asked Apr 18 at 2:38









      user1950349

      1511418




      1511418




















          2 Answers
          2






          active

          oldest

          votes

















          up vote
          4
          down vote



          +25










          Here's a useless use of cat:



          prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`


          That could easily be



          prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)


          (I've fixed your quoting there, too)




          Send error messages to the standard error stream, not standard output:



          [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1; 


          I added >&2 there - it's worth making this into a function:



          die() 
          echo "$@" >&2
          exit 1



          You could insert logger -t "$0" "$*" in there if you wanted, for example.




          This long line is hard to follow:



          (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)


          I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:



          scp_opts=(C -o StrictHostKeyChecking=no)

          (for hold in "$hold1" "$hold2"
          do scp "$scp_opts[@]"
          "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/."
          && break
          done) >/dev/null 2>&1


          I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:



          local el primsec



          For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:



          date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'


          That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:



          date -d @$diff '+Total Time Taken - %T'





          share|improve this answer























          • If you're just greping a file why use redirection?
            – chicks
            Apr 23 at 13:49










          • Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
            – Toby Speight
            Apr 23 at 13:50










          • I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
            – Gareth Rees
            Jun 1 at 9:25

















          up vote
          3
          down vote













          good



          • decent use of comments

          • using shell functions

          • good indentation

          • nice output to keep the user informed

          • decent variable names

          • not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

          quick hints



          • use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

          • line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

          • usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

          • since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

          better output and logging



          You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:



          LOGFILE=/tmp/myscript.log

          myerror()
          local msg="$1"
          echo "$(date) $msg" >> $LOGFILE
          echo "$msg" > &2 # stderr
          exit 1


          # sometime later...
          myerror "something weird happened"


          So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.



          date_echo() 
          local msg="$1"
          # echo "$(date) $msg" >> $LOGFILE
          echo "$(date): $msg"






          share|improve this answer























          • I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
            – chicks
            Apr 23 at 13:45










          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "196"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );








           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192343%2fcopy-files-in-parallel-by-reading-what-files-to-copy-from-few-other-files%23new-answer', 'question_page');

          );

          Post as a guest






























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          4
          down vote



          +25










          Here's a useless use of cat:



          prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`


          That could easily be



          prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)


          (I've fixed your quoting there, too)




          Send error messages to the standard error stream, not standard output:



          [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1; 


          I added >&2 there - it's worth making this into a function:



          die() 
          echo "$@" >&2
          exit 1



          You could insert logger -t "$0" "$*" in there if you wanted, for example.




          This long line is hard to follow:



          (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)


          I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:



          scp_opts=(C -o StrictHostKeyChecking=no)

          (for hold in "$hold1" "$hold2"
          do scp "$scp_opts[@]"
          "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/."
          && break
          done) >/dev/null 2>&1


          I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:



          local el primsec



          For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:



          date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'


          That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:



          date -d @$diff '+Total Time Taken - %T'





          share|improve this answer























          • If you're just greping a file why use redirection?
            – chicks
            Apr 23 at 13:49










          • Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
            – Toby Speight
            Apr 23 at 13:50










          • I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
            – Gareth Rees
            Jun 1 at 9:25














          up vote
          4
          down vote



          +25










          Here's a useless use of cat:



          prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`


          That could easily be



          prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)


          (I've fixed your quoting there, too)




          Send error messages to the standard error stream, not standard output:



          [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1; 


          I added >&2 there - it's worth making this into a function:



          die() 
          echo "$@" >&2
          exit 1



          You could insert logger -t "$0" "$*" in there if you wanted, for example.




          This long line is hard to follow:



          (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)


          I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:



          scp_opts=(C -o StrictHostKeyChecking=no)

          (for hold in "$hold1" "$hold2"
          do scp "$scp_opts[@]"
          "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/."
          && break
          done) >/dev/null 2>&1


          I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:



          local el primsec



          For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:



          date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'


          That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:



          date -d @$diff '+Total Time Taken - %T'





          share|improve this answer























          • If you're just greping a file why use redirection?
            – chicks
            Apr 23 at 13:49










          • Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
            – Toby Speight
            Apr 23 at 13:50










          • I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
            – Gareth Rees
            Jun 1 at 9:25












          up vote
          4
          down vote



          +25







          up vote
          4
          down vote



          +25




          +25




          Here's a useless use of cat:



          prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`


          That could easily be



          prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)


          (I've fixed your quoting there, too)




          Send error messages to the standard error stream, not standard output:



          [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1; 


          I added >&2 there - it's worth making this into a function:



          die() 
          echo "$@" >&2
          exit 1



          You could insert logger -t "$0" "$*" in there if you wanted, for example.




          This long line is hard to follow:



          (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)


          I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:



          scp_opts=(C -o StrictHostKeyChecking=no)

          (for hold in "$hold1" "$hold2"
          do scp "$scp_opts[@]"
          "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/."
          && break
          done) >/dev/null 2>&1


          I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:



          local el primsec



          For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:



          date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'


          That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:



          date -d @$diff '+Total Time Taken - %T'





          share|improve this answer















          Here's a useless use of cat:



          prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`


          That could easily be



          prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)


          (I've fixed your quoting there, too)




          Send error messages to the standard error stream, not standard output:



          [ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1; 


          I added >&2 there - it's worth making this into a function:



          die() 
          echo "$@" >&2
          exit 1



          You could insert logger -t "$0" "$*" in there if you wanted, for example.




          This long line is hard to follow:



          (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)


          I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:



          scp_opts=(C -o StrictHostKeyChecking=no)

          (for hold in "$hold1" "$hold2"
          do scp "$scp_opts[@]"
          "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/."
          && break
          done) >/dev/null 2>&1


          I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:



          local el primsec



          For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:



          date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'


          That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:



          date -d @$diff '+Total Time Taken - %T'






          share|improve this answer















          share|improve this answer



          share|improve this answer








          edited Jun 1 at 9:26


























          answered Apr 23 at 8:44









          Toby Speight

          17.5k13489




          17.5k13489











          • If you're just greping a file why use redirection?
            – chicks
            Apr 23 at 13:49










          • Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
            – Toby Speight
            Apr 23 at 13:50










          • I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
            – Gareth Rees
            Jun 1 at 9:25
















          • If you're just greping a file why use redirection?
            – chicks
            Apr 23 at 13:49










          • Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
            – Toby Speight
            Apr 23 at 13:50










          • I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
            – Gareth Rees
            Jun 1 at 9:25















          If you're just greping a file why use redirection?
          – chicks
          Apr 23 at 13:49




          If you're just greping a file why use redirection?
          – chicks
          Apr 23 at 13:49












          Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
          – Toby Speight
          Apr 23 at 13:50




          Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
          – Toby Speight
          Apr 23 at 13:50












          I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
          – Gareth Rees
          Jun 1 at 9:25




          I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
          – Gareth Rees
          Jun 1 at 9:25












          up vote
          3
          down vote













          good



          • decent use of comments

          • using shell functions

          • good indentation

          • nice output to keep the user informed

          • decent variable names

          • not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

          quick hints



          • use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

          • line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

          • usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

          • since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

          better output and logging



          You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:



          LOGFILE=/tmp/myscript.log

          myerror()
          local msg="$1"
          echo "$(date) $msg" >> $LOGFILE
          echo "$msg" > &2 # stderr
          exit 1


          # sometime later...
          myerror "something weird happened"


          So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.



          date_echo() 
          local msg="$1"
          # echo "$(date) $msg" >> $LOGFILE
          echo "$(date): $msg"






          share|improve this answer























          • I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
            – chicks
            Apr 23 at 13:45














          up vote
          3
          down vote













          good



          • decent use of comments

          • using shell functions

          • good indentation

          • nice output to keep the user informed

          • decent variable names

          • not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

          quick hints



          • use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

          • line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

          • usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

          • since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

          better output and logging



          You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:



          LOGFILE=/tmp/myscript.log

          myerror()
          local msg="$1"
          echo "$(date) $msg" >> $LOGFILE
          echo "$msg" > &2 # stderr
          exit 1


          # sometime later...
          myerror "something weird happened"


          So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.



          date_echo() 
          local msg="$1"
          # echo "$(date) $msg" >> $LOGFILE
          echo "$(date): $msg"






          share|improve this answer























          • I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
            – chicks
            Apr 23 at 13:45












          up vote
          3
          down vote










          up vote
          3
          down vote









          good



          • decent use of comments

          • using shell functions

          • good indentation

          • nice output to keep the user informed

          • decent variable names

          • not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

          quick hints



          • use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

          • line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

          • usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

          • since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

          better output and logging



          You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:



          LOGFILE=/tmp/myscript.log

          myerror()
          local msg="$1"
          echo "$(date) $msg" >> $LOGFILE
          echo "$msg" > &2 # stderr
          exit 1


          # sometime later...
          myerror "something weird happened"


          So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.



          date_echo() 
          local msg="$1"
          # echo "$(date) $msg" >> $LOGFILE
          echo "$(date): $msg"






          share|improve this answer















          good



          • decent use of comments

          • using shell functions

          • good indentation

          • nice output to keep the user informed

          • decent variable names

          • not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

          quick hints



          • use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

          • line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

          • usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

          • since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

          better output and logging



          You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:



          LOGFILE=/tmp/myscript.log

          myerror()
          local msg="$1"
          echo "$(date) $msg" >> $LOGFILE
          echo "$msg" > &2 # stderr
          exit 1


          # sometime later...
          myerror "something weird happened"


          So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.



          date_echo() 
          local msg="$1"
          # echo "$(date) $msg" >> $LOGFILE
          echo "$(date): $msg"







          share|improve this answer















          share|improve this answer



          share|improve this answer








          edited Apr 23 at 13:47


























          answered Apr 23 at 3:00









          chicks

          1,3792818




          1,3792818











          • I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
            – chicks
            Apr 23 at 13:45
















          • I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
            – chicks
            Apr 23 at 13:45















          I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
          – chicks
          Apr 23 at 13:45




          I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
          – chicks
          Apr 23 at 13:45












           

          draft saved


          draft discarded


























           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192343%2fcopy-files-in-parallel-by-reading-what-files-to-copy-from-few-other-files%23new-answer', 'question_page');

          );

          Post as a guest













































































          Popular posts from this blog

          Greedy Best First Search implementation in Rust

          Function to Return a JSON Like Objects Using VBA Collections and Arrays

          C++11 CLH Lock Implementation