Copy files in parallel by reading what files to copy from few other files

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
7
down vote

favorite

I am working on a little complicated shell script for the first time and this is what it should do:

During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

My script does the job but I would like to see if there is any better or efficient way to do these things since this is my first time writing a little complicated script so I wanted to check this out. As of now, I don't have a mechanism to exit out of the shell script if I cannot find primary or secondary files for that clientid. I also don't have a mechanism to exit out of the shell script if during the verification phase some files are missing.

#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2

date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () grep "$prop_value" 
# if I can't find clientid for my hostname, then I will log a message 
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1; 

# now once I have valid clientid, then I will get primary and secondary mapping
# from the "host-mapping.txt" file
declare -a arr
mappingsByClientID () 
 id=$1 # 1 to 5
 file=$path/$mapfiles[$2] # 0 to 1
 arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
 echo "$arr[@]"


# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))

echo "primary files: $pri[@]"
echo "secondary files: $snd[@]"

# figure out which machine you want to use to start copying files from
case $(hostname -f) in
 *abc.host.com)
 local_server=("$machines[0]")
 ;;
 *def.host.com)
 local_server=("$machines[1]")
 ;;
 *pqr.host.com)
 local_server=("$machines[2]")
 ;;
 *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
 # ?
esac
export local="$local_server"

# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv ;
find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy $primary ::: $pri[@] &
parallel -j "$3" do_copy $secondary ::: $snd[@] &
wait
echo "all files copied"

# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
 for num in "$array[@]"; do
 name="hello_monthly_$num_999_1.data"
 if [ ! -f "$1/$name" ]; then
 echo "$name" not found in "$1" >&2 && exit 1; 
 fi
 done
 shift
done

date2=$(date +"%s")
diff=$(($date2-$date1))
echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."

Here is my host-mapping.txt file that will have lot more entries. Here, value is a valid hostname and key will be a string "k" followed by some number and that number should be there in the mapping files.

k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com

And here is my sample mapping files:

primary_mappings.txt

1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]

secondary_mappings.txt

1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]

For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

asked Apr 18 at 2:38

user1950349

1511418

add a commentÂ |Â

up vote
7
down vote

favorite

I am working on a little complicated shell script for the first time and this is what it should do:

During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2

date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () grep "$prop_value" 
# if I can't find clientid for my hostname, then I will log a message 
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1; 

# now once I have valid clientid, then I will get primary and secondary mapping
# from the "host-mapping.txt" file
declare -a arr
mappingsByClientID () 
 id=$1 # 1 to 5
 file=$path/$mapfiles[$2] # 0 to 1
 arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
 echo "$arr[@]"


# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))

echo "primary files: $pri[@]"
echo "secondary files: $snd[@]"

# figure out which machine you want to use to start copying files from
case $(hostname -f) in
 *abc.host.com)
 local_server=("$machines[0]")
 ;;
 *def.host.com)
 local_server=("$machines[1]")
 ;;
 *pqr.host.com)
 local_server=("$machines[2]")
 ;;
 *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
 # ?
esac
export local="$local_server"

# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv ;
find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy $primary ::: $pri[@] &
parallel -j "$3" do_copy $secondary ::: $snd[@] &
wait
echo "all files copied"

# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
 for num in "$array[@]"; do
 name="hello_monthly_$num_999_1.data"
 if [ ! -f "$1/$name" ]; then
 echo "$name" not found in "$1" >&2 && exit 1; 
 fi
 done
 shift
done

date2=$(date +"%s")
diff=$(($date2-$date1))
echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."

k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com

And here is my sample mapping files:

primary_mappings.txt

1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]

secondary_mappings.txt

1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]

For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

asked Apr 18 at 2:38

user1950349

1511418

add a commentÂ |Â

up vote
7
down vote

favorite

I am working on a little complicated shell script for the first time and this is what it should do:

During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2

date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () grep "$prop_value" 
# if I can't find clientid for my hostname, then I will log a message 
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1; 

# now once I have valid clientid, then I will get primary and secondary mapping
# from the "host-mapping.txt" file
declare -a arr
mappingsByClientID () 
 id=$1 # 1 to 5
 file=$path/$mapfiles[$2] # 0 to 1
 arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
 echo "$arr[@]"


# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))

echo "primary files: $pri[@]"
echo "secondary files: $snd[@]"

# figure out which machine you want to use to start copying files from
case $(hostname -f) in
 *abc.host.com)
 local_server=("$machines[0]")
 ;;
 *def.host.com)
 local_server=("$machines[1]")
 ;;
 *pqr.host.com)
 local_server=("$machines[2]")
 ;;
 *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
 # ?
esac
export local="$local_server"

# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv ;
find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy $primary ::: $pri[@] &
parallel -j "$3" do_copy $secondary ::: $snd[@] &
wait
echo "all files copied"

# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
 for num in "$array[@]"; do
 name="hello_monthly_$num_999_1.data"
 if [ ! -f "$1/$name" ]; then
 echo "$name" not found in "$1" >&2 && exit 1; 
 fi
 done
 shift
done

date2=$(date +"%s")
diff=$(($date2-$date1))
echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."

k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com

And here is my sample mapping files:

primary_mappings.txt

1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]

secondary_mappings.txt

1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]

For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

asked Apr 18 at 2:38

user1950349

1511418

I am working on a little complicated shell script for the first time and this is what it should do:

During startup, it figures out what is my clientid by looking at host-mapping.txt file. If I cannot find clientid for my hostname then I need to exit from the shell script with a non-zero status code and log error message.

Now once I have a valid clientid, I will extract primary files from the primary-mappings.txt file and secondary files from the secondary-mappings.txt file for that valid clientid. If for whatever reason, I cannot find either primary or secondary files for that clientid from that file, then I will exit from the shell script and log an error message.

Now once I have valid primary and secondary files for that clientid then I will start copying those files in parallel using gnu-parallel from local_server. All primary files will go to primary folder and all secondary files will go to secondary folder. If files are not there in hold1 folder on remote servers then it should be there in the hold2 folder.

Now once all the files are copied, I will verify at the end to make sure all the primary and secondary files are present for that clientid in those two folders, but if for whatever reason I cannot find those files, then I want to exit from the shell script with a message that tells me what files are missing.

#!/bin/bash
path=/home/goldy/scripts
mapfiles=(primary-mappings.txt secondary-mappings.txt)
hostfile=host-mapping.txt
machines=(machine1769.abc.host.com proctek5461.def.host.com letyrs87541.pqr.host.com)
# folders on local box where to copy files
primary=/data01/primary
secondary=/data02/secondary
# folders on remote servers from where to copy files
export hold1=/data/snapshot/$1
export hold2=/data/snapshot/$2

date1=$(date +"%s")
# this will tell me what's my clientid given my current hostname
getProperty () grep "$prop_value" 
# if I can't find clientid for my hostname, then I will log a message 
# and exit out of shell script with non zero status code
clientid=$(getProperty)
[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)"; exit 1; 

# now once I have valid clientid, then I will get primary and secondary mapping
# from the "host-mapping.txt" file
declare -a arr
mappingsByClientID () 
 id=$1 # 1 to 5
 file=$path/$mapfiles[$2] # 0 to 1
 arr=($(sed -r "s/.*b$id=[([^]]+).*/1/; s/,/ /g" $file))
 echo "$arr[@]"


# assign output of function to an array
pri=($(mappingsByClientID $clientid 0))
snd=($(mappingsByClientID $clientid 1))

echo "primary files: $pri[@]"
echo "secondary files: $snd[@]"

# figure out which machine you want to use to start copying files from
case $(hostname -f) in
 *abc.host.com)
 local_server=("$machines[0]")
 ;;
 *def.host.com)
 local_server=("$machines[1]")
 ;;
 *pqr.host.com)
 local_server=("$machines[2]")
 ;;
 *) echo "unknown host: $(hostname -f), exiting." && exit 1 ;;
 # ?
esac
export local="$local_server"

# deleting files before we start copying
find "$primary" -maxdepth 1 -type f -exec rm -fv ;
find "$secondary" -maxdepth 1 -type f -exec rm -fv ;

do_copy() (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

export -f do_copy
# copy files in parallel
parallel -j "$3" do_copy $primary ::: $pri[@] &
parallel -j "$3" do_copy $secondary ::: $snd[@] &
wait
echo "all files copied"

# this is for verification to see all files got copied or not
# in primary and secondary folder
set -- "$primary" "$secondary"
typeset -n array
for array in pri snd; do
 for num in "$array[@]"; do
 name="hello_monthly_$num_999_1.data"
 if [ ! -f "$1/$name" ]; then
 echo "$name" not found in "$1" >&2 && exit 1; 
 fi
 done
 shift
done

date2=$(date +"%s")
diff=$(($date2-$date1))
echo "Total Time Taken - $(($diff / 3600)) hours and $(((diff/60) % 60)) minutes and $(($diff % 60)) seconds elapsed."

k1=machineA.abc.com
k2=machineB.abc.com
k3=machineC.def.com
k4=machineD.pqr.com
k5=machineO.abc.com

And here is my sample mapping files:

primary_mappings.txt

1=[343, 0, 686, 1372, 882, 196], 2=[687, 1, 1373, 883, 197, 736, 1030, 1569], 3=[1374, 2, 884, 737, 198, 1570], 4=[1375, 1032, 1424, 3, 885, 1228], 5=[1033, 1425, 4, 200, 886]

secondary_mappings.txt

1=[1152, 816, 1488, 336, 1008], 2=[1153, 0, 817, 337, 1489, 1009, 1297], 3=[1, 1154, 1490, 338], 4=[1155, 2, 339, 1491, 819, 1299, 1635], 5=[820, 1492, 340, 3, 1156]

For example: clientid 1 has 343, 0, 686, 1372, 882, 196 primary files and 1152, 816, 1488, 336, 1008 secondary files, similarly for other clientids as well.

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

asked Apr 18 at 2:38

user1950349

1511418

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

edited Apr 18 at 3:50

Jamalâ™¦

30.1k11114225

asked Apr 18 at 2:38

user1950349

1511418

asked Apr 18 at 2:38

user1950349

1511418

asked Apr 18 at 2:38

user1950349

1511418

add a commentÂ |Â

2 Answers
2

active

oldest

votes

up vote
4
down vote

+25

Here's a useless use of cat:

prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`

That could easily be

prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)

(I've fixed your quoting there, too)

Send error messages to the standard error stream, not standard output:

[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1;

I added >&2 there - it's worth making this into a function:

die() 
 echo "$@" >&2
 exit 1

You could insert logger -t "$0" "$*" in there if you wanted, for example.

This long line is hard to follow:

(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

I'm not sure why it needs to be in a sub-shell, but assuming that's necessary, we can redirect the stdout and stderr of the entire sub-shell, which will shorten command. It might be worth identifying what changes between the two invocations with a simple for loop:

scp_opts=(C -o StrictHostKeyChecking=no)

(for hold in "$hold1" "$hold2"
 do scp "$scp_opts[@]" 
 "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/." 
 && break
 done) >/dev/null 2>&1

I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:

local el primsec

For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:

date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'

That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:

date -d @$diff '+Total Time Taken - %T'

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

add a commentÂ |Â

up vote
3
down vote

good

decent use of comments

using shell functions

good indentation

nice output to keep the user informed

decent variable names

not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

quick hints

use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

better output and logging

You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:

LOGFILE=/tmp/myscript.log

myerror() 
 local msg="$1"
 echo "$(date) $msg" >> $LOGFILE
 echo "$msg" > &2 # stderr
 exit 1


# sometime later...
myerror "something weird happened"

So, this will take your error message, display it to the user, and append it to a logfile, before exiting. You could easily extend this to include the PID ($$). If you like having the date/time in your output you could also create a date_echo or something that takes your message and puts the date on front of it.

date_echo() 
 local msg="$1"
 # echo "$(date) $msg" >> $LOGFILE
 echo "$(date): $msg"

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f192343%2fcopy-files-in-parallel-by-reading-what-files-to-copy-from-few-other-files%23new-answer', 'question_page');

);

Post as a guest

Name

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
4
down vote

+25

Here's a useless use of cat:

prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`

That could easily be

prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)

(I've fixed your quoting there, too)

Send error messages to the standard error stream, not standard output:

[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1;

I added >&2 there - it's worth making this into a function:

die() 
 echo "$@" >&2
 exit 1

You could insert logger -t "$0" "$*" in there if you wanted, for example.

This long line is hard to follow:

(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

scp_opts=(C -o StrictHostKeyChecking=no)

(for hold in "$hold1" "$hold2"
 do scp "$scp_opts[@]" 
 "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/." 
 && break
 done) >/dev/null 2>&1

I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:

local el primsec

For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:

date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'

That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:

date -d @$diff '+Total Time Taken - %T'

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

add a commentÂ |Â

up vote
4
down vote

+25

Here's a useless use of cat:

prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`

That could easily be

prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)

(I've fixed your quoting there, too)

Send error messages to the standard error stream, not standard output:

[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1;

I added >&2 there - it's worth making this into a function:

die() 
 echo "$@" >&2
 exit 1

You could insert logger -t "$0" "$*" in there if you wanted, for example.

This long line is hard to follow:

(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

scp_opts=(C -o StrictHostKeyChecking=no)

(for hold in "$hold1" "$hold2"
 do scp "$scp_opts[@]" 
 "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/." 
 && break
 done) >/dev/null 2>&1

I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:

local el primsec

For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:

date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'

That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:

date -d @$diff '+Total Time Taken - %T'

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

add a commentÂ |Â

up vote
4
down vote

+25

up vote
4
down vote

+25

Here's a useless use of cat:

prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`

That could easily be

prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)

(I've fixed your quoting there, too)

Send error messages to the standard error stream, not standard output:

[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1;

I added >&2 there - it's worth making this into a function:

die() 
 echo "$@" >&2
 exit 1

You could insert logger -t "$0" "$*" in there if you wanted, for example.

This long line is hard to follow:

(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

scp_opts=(C -o StrictHostKeyChecking=no)

(for hold in "$hold1" "$hold2"
 do scp "$scp_opts[@]" 
 "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/." 
 && break
 done) >/dev/null 2>&1

I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:

local el primsec

For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:

date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'

That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:

date -d @$diff '+Total Time Taken - %T'

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

Here's a useless use of cat:

prop_key=`cat $path/$hostfile | grep "$prop_value" | cut -d'=' -f1`

That could easily be

prop_key=$(grep -e "$prop_value" -- "$path/$hostfile" | cut -d= -f1)

(I've fixed your quoting there, too)

Send error messages to the standard error stream, not standard output:

[ -z "$clientid" ] && echo "cannot find clientid for $(hostname -f)" >&2; exit 1;

I added >&2 there - it's worth making this into a function:

die() 
 echo "$@" >&2
 exit 1

You could insert logger -t "$0" "$*" in there if you wanted, for example.

This long line is hard to follow:

(scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold1"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1) || (scp -C -o StrictHostKeyChecking=no goldy@"$local":"$hold2"/hello_monthly_"$el"_999_1.data "$primsec"/. > /dev/null 2>&1)

scp_opts=(C -o StrictHostKeyChecking=no)

(for hold in "$hold1" "$hold2"
 do scp "$scp_opts[@]" 
 "goldy@$local:$hold/hello_monthly_$1_999_1.data" "$2/." 
 && break
 done) >/dev/null 2>&1

I've just used $1 and $2 directly - but if you want to give them names, I recommend declaring them both as local:

local el primsec

For the final formatting of seconds into hours, minutes and seconds, you can use date again (as you seem to be assuming GNU date), if the duration will be less than 24 hours:

date -d @$diff '+Total Time Taken - %-H hours and %-M minutes and %-S seconds elapsed.'

That still suffers from the "1 seconds" problem. You might prefer a simple numeric form:

date -d @$diff '+Total Time Taken - %T'

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

edited Jun 1 at 9:26

answered Apr 23 at 8:44

Toby Speight

17.5k13489

answered Apr 23 at 8:44

Toby Speight

17.5k13489

answered Apr 23 at 8:44

Toby Speight

17.5k13489

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

add a commentÂ |Â

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

If you're just greping a file why use redirection?
â€“Â chicks
Apr 23 at 13:49

Thanks @chicks - I did mean to let grep open its own file. I've edited accordingly.
â€“Â Toby Speight
Apr 23 at 13:50

I recommend grep -e "$prop_value" -- "$path/$hostfile" just in case $prop_value or $path starts with a hyphen.
â€“Â Gareth Rees
Jun 1 at 9:25

add a commentÂ |Â

up vote
3
down vote

good

decent use of comments

using shell functions

good indentation

nice output to keep the user informed

decent variable names

not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

quick hints

use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

better output and logging

You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:

LOGFILE=/tmp/myscript.log

myerror() 
 local msg="$1"
 echo "$(date) $msg" >> $LOGFILE
 echo "$msg" > &2 # stderr
 exit 1


# sometime later...
myerror "something weird happened"

date_echo() 
 local msg="$1"
 # echo "$(date) $msg" >> $LOGFILE
 echo "$(date): $msg"

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

add a commentÂ |Â

up vote
3
down vote

good

decent use of comments

using shell functions

good indentation

nice output to keep the user informed

decent variable names

not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

quick hints

use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

better output and logging

You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:

LOGFILE=/tmp/myscript.log

myerror() 
 local msg="$1"
 echo "$(date) $msg" >> $LOGFILE
 echo "$msg" > &2 # stderr
 exit 1


# sometime later...
myerror "something weird happened"

date_echo() 
 local msg="$1"
 # echo "$(date) $msg" >> $LOGFILE
 echo "$(date): $msg"

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

add a commentÂ |Â

up vote
3
down vote

good

decent use of comments

using shell functions

good indentation

nice output to keep the user informed

decent variable names

not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

quick hints

use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

better output and logging

You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:

LOGFILE=/tmp/myscript.log

myerror() 
 local msg="$1"
 echo "$(date) $msg" >> $LOGFILE
 echo "$msg" > &2 # stderr
 exit 1


# sometime later...
myerror "something weird happened"

date_echo() 
 local msg="$1"
 # echo "$(date) $msg" >> $LOGFILE
 echo "$(date): $msg"

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

good

decent use of comments

using shell functions

good indentation

nice output to keep the user informed

decent variable names

not too complicated for a shell script. And the complexity is reasonably well compartmentalized.

quick hints

use https://www.shellcheck.net/ to get helpful suggestions. In your case it is mostly just suggesting to use quotes in a few more places and skip the `` (backticks) form of command substitution. Since you're already using the more modern $() form of command substitution that will help make the code more consistent.

line comments usually look better with a blank line above them. Also, it'd be nice to comment the initial lines 2-11 as "defaults" or something.

usually shell functions are grouped together near the top of the script right after the variable assignments. If you want they could be in a separate file that gets sourced.

since you only call getProperty once to get the clientid why not include the error checking portion of things inside the function?

better output and logging

You mention wanting to log error messages, but I don't see anything that write outs to a log. This is easy enough to do with something like:

LOGFILE=/tmp/myscript.log

myerror() 
 local msg="$1"
 echo "$(date) $msg" >> $LOGFILE
 echo "$msg" > &2 # stderr
 exit 1


# sometime later...
myerror "something weird happened"

date_echo() 
 local msg="$1"
 # echo "$(date) $msg" >> $LOGFILE
 echo "$(date): $msg"

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

edited Apr 23 at 13:47

answered Apr 23 at 3:00

chicks

1,3792818

answered Apr 23 at 3:00

chicks

1,3792818

answered Apr 23 at 3:00

chicks

1,3792818

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

add a commentÂ |Â

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

I agree local would be an improvement so I changed that. I'm working on the stderr suggestion.
â€“Â chicks
Apr 23 at 13:45

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

搜尋此網誌

trjhtr

Copy files in parallel by reading what files to copy from few other files

2 Answers
2

good

quick hints

better output and logging

Your Answer

Post as a guest

2 Answers
2

2 Answers
2

good

quick hints

better output and logging

good

quick hints

better output and logging

good

quick hints

better output and logging

good

quick hints

better output and logging

Post as a guest

Popular posts from this blog

Chat program with C++ and SFML

Function to Return a JSON Like Objects Using VBA Collections and Arrays

Will my employers contract hold up in court?

Copy files in parallel by reading what files to copy from few other files

2 Answers 2

good

quick hints

better output and logging

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

good

quick hints

better output and logging

good

quick hints

better output and logging

good

quick hints

better output and logging

good

quick hints

better output and logging

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog

Chat program with C++ and SFML

Function to Return a JSON Like Objects Using VBA Collections and Arrays

Will my employers contract hold up in court?

2 Answers
2

2 Answers
2

2 Answers
2