Rsync
Rsync and Parallel to copy files
Lightning fast (two steps/commands)
1. Make a text file listing the paths of the files to be copied
find ./<folder_having_all_the_files_to_be_copied> -type f > <location_with_name_of_text_file>.txt
Example command: find ./downloaded_files/items_to_move -type f > to_copy.txt
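Before starting a large copy, it can help to confirm that the list contains what you expect. The following is a minimal sketch on a throwaway tree created with mktemp; the names src, list, count and the sample files are made up for the demo and are not part of the original commands.

```shell
# Build a small throwaway tree so the sketch is self-contained.
src=$(mktemp -d)
mkdir -p "$src/items_to_move/sub"
touch "$src/items_to_move/a.txt" "$src/items_to_move/sub/b.txt"

# Step 1 from above: write one file path per line.
list="$src/to_copy.txt"
find "$src/items_to_move" -type f > "$list"

# Count the entries and peek at the first few before copying.
count=$(wc -l < "$list" | tr -d ' ')
echo "files listed: $count"
head -n 5 "$list"
```

A one-path-per-line list assumes that no file name contains a newline character.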
2. Use rsync with parallel to copy the files
time cat <previously_made_text_file>.txt | parallel -j <num_of_cores_to_use> -X -n 1 rsync -av --inplace {} ./<destination_folder_to_copy_the_files>
Example command: time cat to_copy.txt | parallel -j 10 -X -n 1 rsync -av --inplace {} ./Receiver/
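The value given to -j is the number of jobs that parallel runs at the same time. One possible starting point is the number of available CPU cores, sketched below. The variable name njobs is only illustrative, and since copying is often limited by disk or network speed rather than CPU, the best value may turn out higher or lower.

```shell
# Number of available processing units; fall back to 1 if neither tool is present.
njobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "starting point for -j: $njobs"

# Then, for example:
# time cat to_copy.txt | parallel -j "$njobs" -X -n 1 rsync -av --inplace {} ./Receiver/
```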
In short, you will run
find ./downloaded_files/items_to_move -type f > to_copy.txt
time cat to_copy.txt | parallel -j 90 -X -n 1 rsync -av --inplace {} ./Receiver/
to copy anywhere from a single file to more than a million files.
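After the copy finishes, a quick sanity check is to compare the number of listed paths with the number of files at the destination. This is only a sketch: check_copy is a hypothetical helper, and it assumes the destination was empty before the copy. Without the --relative flag from step 3, files that share a name in different source folders land on the same destination path, so a mismatch can also point to such collisions.

```shell
# check_copy LIST DEST: compare the number of listed paths with the files under DEST.
check_copy() {
  expected=$(wc -l < "$1" | tr -d ' ')
  actual=$(find "$2" -type f | wc -l | tr -d ' ')
  if [ "$expected" -eq "$actual" ]; then
    echo "OK: $actual files at destination"
  else
    echo "Mismatch: $expected listed, $actual at destination" >&2
    return 1
  fi
}

# Usage with the example names from above:
# check_copy to_copy.txt ./Receiver/
```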
3. To keep the folder structure while copying (files inside different folders)
- Add the --relative flag to the rsync part of the command
  - --relative tells rsync to recreate the directory structure based on the file path
time cat to_copy.txt | parallel -j 10 -X -n 1 rsync -av --inplace --relative {} ./Receiver/
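To see what --relative does to the destination path, here is a small sketch on a throwaway tree. It assumes rsync is installed and skips the copy otherwise; the directory and file names are made up for the demo.

```shell
# Throwaway source and destination directories (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src/photos/2024" "$work/Receiver"
echo "demo" > "$work/src/photos/2024/img.txt"

if command -v rsync >/dev/null 2>&1; then
  # Run from inside the source folder so the relative path starts at ./photos
  (cd "$work/src" && rsync -a --relative ./photos/2024/img.txt "$work/Receiver/")
  # List what was created under the destination.
  find "$work/Receiver" -type f
else
  echo "rsync not found; skipping demo" >&2
fi
```

With --relative, the path as written on the command line is recreated under the destination. For the list from step 1, every path starts with ./downloaded_files/items_to_move/, so that whole prefix is recreated under ./Receiver/.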
Command breakdown:
- time records how long the command takes (wall-clock, user, and system time).
- cat reads the text file and writes its contents to standard output.
- | (pipe symbol) takes the output of the command on the left (cat to_copy.txt) and feeds it as input to the command on the right, so the output of one command becomes the input of the next.
- rsync is the utility for synchronizing files and directories.
- -av combines the "archive" and "verbose" options.
  - Archive preserves permissions, timestamps, and other attributes.
  - Verbose lists which files are being transferred and any changes being made.
- --inplace tells rsync to write updated data directly into the destination file, rather than building a temporary copy and renaming it into place.
- --relative tells rsync to keep the directory structure of the source paths.
- {} is the placeholder that parallel replaces with the name of each file passed to rsync.