Linux tips
From silico.biotoul.fr
Paths & I/O & files
Linux filesystem organization
/ | The root directory. |
/boot | Boot directory (kernel and boot loader) |
/etc | Configuration files for the system. e.g. /etc/fstab specifies which drives to mount where. /etc/hosts lists network hosts and IP addresses. |
/bin /usr/bin | The /bin directory has the essential programs that the system requires to operate, while /usr/bin contains applications for the system's users. |
/sbin /usr/sbin | The sbin directories contain programs for system administration, mostly for use by the superuser (root). |
/usr | contains things that support user applications. |
/usr/local | /usr/local and its subdirectories (bin, lib, share, ...) are used for the installation of software and other files for use on the local machine i.e., not part of the official distribution. |
/var | contains files that change as the system is running. This includes: log (logs!), spool (files that are queued for some process, such as mail messages and print jobs) |
/lib /lib64 | shared libraries (similar to DLLs on Windows) |
/home | users' personal directories |
/root | System administrator's home directory |
/tmp | holds temporary files (anybody/program can write) |
/dev | In linux, devices are represented by files under that directory (e.g. disks are block devices such as /dev/sda or /dev/hda usually for the 1st hard drive) |
/proc | virtual directory giving access to the running kernel and system. e.g. /proc/cpuinfo /proc/meminfo /proc/uptime |
/media /run/media /mnt | removable devices (usb sticks, usb drives, ...) are usually mounted in one of those when plugged |
File systems & partitions: fdisk, gdisk, parted, mount, lsof, mkfs
- storage devices (hard drives, CD/DVD, usb sticks, ...) are divided in partitions (at least one to be usable)
- partitions are formatted with a filesystem
- fat16, fat32, ntfs from microsoft for windows systems
- iso9660 for CDs
- ext2, ext3, ext4 (and others) for linux systems
- partitions are mounted somewhere in the filesystem tree, e.g. / or /boot or /mnt/cdrom or /home, to give access to their files
- there is another type of partition: LVM (Logical Volume Management), which is not a filesystem but allows a logical volume to span more than one drive:
- LVM partitions of possibly different drives are converted to physical volumes
- physical volumes are combined in a volume group
- logical volumes are created in a volume group and formatted with a filesystem that can then be mounted
Note: different filesystems have different capabilities. For example, FAT does not manage permissions.
The advantage of using LVM is that if a filesystem becomes too small, one can dedicate more physical space to the volume group and resize the logical volume (and then the filesystem it contains).
TO DO:
- /etc/fstab
- mount
- umount
- lsof for when umount fails because some files are currently used on the filesystem
- partitioning with fdisk, gdisk, and other
- formatting with mkfs (mkfs.ext4, mkfs.iso9660, ...)
Paths & directories: pwd, mkdir, rmdir, rm
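- pwd prints the current directory
- relative path (relative to the current directory): e.g. ls subdir/subsubdir or ls ../whatever/
- absolute path: e.g. ls ~user/path or ls /home/user/path
- mkdir: create a directory, e.g. mkdir ~/newdir, or with its missing parent directories mkdir -p ~/new/newsub/newsubsub
- rmdir dirname removes an empty directory; if it is not empty, use rm -r dirname (add -f to suppress prompts and errors about missing files)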
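The behaviour of mkdir -p, rmdir and rm -r described above can be checked safely in a throwaway directory. This is a minimal sketch: the directory names are the hypothetical ones from the list, and everything happens under a temporary directory created by mktemp.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# -p creates every missing parent directory in one call.
mkdir -p "$workdir/new/newsub/newsubsub"

# rmdir refuses to remove a directory that still has content.
if rmdir "$workdir/new" 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi

# rm -r removes the directory and everything below it.
rm -r "$workdir/new"
if [ -d "$workdir/new" ]; then
    rm_result="still there"
else
    rm_result="gone"
fi

echo "rmdir on a non-empty directory: $rmdir_result"
echo "after rm -r: $rm_result"

# The work directory is empty now, so rmdir succeeds.
rmdir "$workdir"
```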
Permissions: chown, chgrp, chmod
$ ls -l /home
drwxr-x--- 69 barriot gsi   4.0K Mar  5 12:09 barriot
drwx------  2 root    root   16K Jul 12  2010 lost+found
drwxr-xr-x 36 micas   stage 4.0K Jul 31  2012 micas
...
[barriot@gamborimbo ~]$ ls -lh Documents/TEACHING/2012-2013/M1-MABS/Graph/TP3-igraph.layout/
total 80K
drwxr-xr-x 1 barriot gsi 4.0K Mar 14  2012 HDE.old
-rw-r--r-- 1 barriot gsi  24K Mar 14  2012 91347.nwk
-rw-r--r-- 1 barriot gsi  942 Mar  1 16:02 Cleandb_Luca_1_S_1_1_65_Iso_Tr_1-CC1.cod
-rw-r--r-- 1 barriot gsi  28K Sep  7  2010 Cleandb_Luca_1_S_1_1_65_Iso_Tr_1-CC1.gr
-rw-r--r-- 1 barriot gsi 2.3K Sep  7  2010 Cleandb_Luca_1_S_1_1_65_Iso_Tr_1-CC1.tgr
-rw-r--r-- 1 barriot gsi 4.7K Mar  5 11:42 cmds.R
-rw-r--r-- 1 barriot gsi  871 Mar 14  2012 sample_tree_with_branchlengths.nwk
-rwxr-xr-x 1 barriot gsi  670 Mar 14  2012 drawTree.py
-rw-r--r-- 1 barriot gsi 5.6K Feb 27 16:57 Tree.py
The first character gives the file type: d for a directory, - for a regular file, ... The next nine characters are three groups of three permissions (rwx): one for the owner (user), one for the group, and one for the others.
For a regular file:
- r for permission to read
- w for permission to modify
- x for being able to execute the file (binary executable or script)
For a directory:
- r to be able to read the content (list files in the directory)
- w to be able to add or remove files
- x to be able to pass through that directory, i.e. cd to that dir or a subdir
Modify ownership of a file or directory:
# change owner
chown newuser file
# recursive
chown -R newuser directory
# change group
chgrp newgroup filename
# change both (the older newuser.newgroup form is deprecated)
chown newuser:newgroup filename
Modify permissions:
# numeric notation: r=4, w=2, x=1, thus for rwxr-x--- (7 = 4+2+1, 5 = 4+1, 0)
chmod 750 file
# recursively on a sub directory
chmod -R 750 dirname
# symbolic notation:
chmod u=rwx,g=rx,o= filename
# add execute permission for all:
chmod a+x filename
# revoke write permission for others:
chmod o-w filename
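To see how the numeric and symbolic notations map onto each other, the following sketch sets modes on a temporary file and reads them back. It assumes GNU coreutils (stat -c), which is what most Linux distributions ship; the file itself is a throwaway created by mktemp.

```shell
#!/usr/bin/env bash
f=$(mktemp)

# rwx (4+2+1=7) for the owner, r-x (4+1=5) for the group, --- (0) for others.
chmod 750 "$f"
numeric_mode=$(stat -c '%a' "$f")     # octal form
symbolic_mode=$(stat -c '%A' "$f")    # same form as in ls -l

# The symbolic notation can set each class explicitly...
chmod u=rw,g=r,o= "$f"
mode_after_set=$(stat -c '%a' "$f")

# ...or add/remove single bits relative to the current mode.
chmod a+x "$f"
mode_after_add=$(stat -c '%a' "$f")

echo "750          -> $numeric_mode $symbolic_mode"
echo "u=rw,g=r,o=  -> $mode_after_set"
echo "a+x          -> $mode_after_add"

rm -f "$f"
```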
File info & type: stat, file
[barriot@gamborimbo ~]$ stat /home/barriot
  File: `/home/barriot'
  Size: 12288      Blocks: 24         IO Block: 4096   directory
Device: fd02h/64770d    Inode: 1048577     Links: 119
Access: (0755/drwxr-xr-x)  Uid: (  500/ barriot)   Gid: (  501/     gsi)
Access: 2013-03-05 10:39:08.927051453 +0100
Modify: 2013-03-05 10:39:00.240074369 +0100
Change: 2013-03-05 10:39:00.240074369 +0100
 Birth: -
[barriot@gamborimbo ~]$ stat .bashrc
  File: `.bashrc'
  Size: 517        Blocks: 8          IO Block: 4096   regular file
Device: fd02h/64770d    Inode: 1052239     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (  500/ barriot)   Gid: (  501/     gsi)
Access: 2013-03-02 16:04:19.268619379 +0100
Modify: 2012-10-12 17:24:24.818899216 +0200
Change: 2012-11-18 23:25:18.869870338 +0100
 Birth: -
[barriot@gamborimbo ~]$ file /home/barriot
/home/barriot: directory
[barriot@gamborimbo ~]$ file .bashrc
.bashrc: ASCII text
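In scripts, the full stat report shown above is awkward to parse. A possible alternative, sketched here under the assumption that GNU stat is available, is to ask for single fields with -c and a format string (%s size in bytes, %F file type). The file and directory are temporary ones created for the example.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/greeting.txt"   # 5 letters + newline

# One field per call keeps the output trivial to store in a variable.
size=$(stat -c '%s' "$workdir/greeting.txt")
file_type=$(stat -c '%F' "$workdir/greeting.txt")
dir_type=$(stat -c '%F' "$workdir")

echo "greeting.txt: $size bytes, $file_type"
echo "workdir: $dir_type"

rm -r "$workdir"
```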
File content, concatenation, split, ... and redirections: cat, split, head, tail, more, less, tac
# display content
cat somefile.txt
# concatenate 2 or more files
cat file_1.txt file_2.txt
cat *.txt
# redirect to a file (if the file exists it is overwritten, otherwise it is created)
cat file_1.txt file_2.txt > result.txt
# redirect to a file (if the file exists the output is appended at the end, otherwise it is created)
cat others*.txt >> result.txt
# split a file into smaller parts
## by file size (1 KiB)
split --bytes 1024 big.file
split -b 1024 big.file
## by number of lines per output file
split --lines 100 big.text.file.txt
split -l 100 big.text.file.txt
## by number of output files
split --number 10 big.file
split -n 10 big.file
## specify the output files prefix, with numeric suffixes (3 digits)
split -n 100 -a 3 -d big.file part_
split -n 100 --suffix-length 3 --numeric-suffixes big.file part_
# display the lines of a file in reverse order
tac file.txt
# first 10 lines of files
head -n 10 *.txt
# last 10 lines
tail -n 10 *.txt
# last lines of a file, then keep printing new lines as they are added to the file
tail -f /var/log/httpd/error.log
# content of a file page by page (space for next page, enter for next line)
more file.txt
# content of a file: page up/down to browse. /expr to search (then n for next match and N for previous match). q to exit
less file.txt
grep, cut, sort, wc, find, sed
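As a small end-to-end illustration of split, cat, head and tail working together, the sketch below splits a generated file into 100-line parts and rebuilds it. The file names are made up for the example, and the -d (numeric suffixes) option assumes GNU split.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# A 250-line file: one number per line.
seq 1 250 > "$workdir/big.txt"

# 100 lines per part, numeric 3-digit suffixes: part_000, part_001, part_002
split -l 100 -d -a 3 "$workdir/big.txt" "$workdir/part_"
part_count=$(ls "$workdir"/part_* | wc -l)

# head and tail show where each part starts and ends.
first_of_second=$(head -n 1 "$workdir/part_001")
last_of_third=$(tail -n 1 "$workdir/part_002")

# The shell expands part_* in sorted order, so cat restores the original.
cat "$workdir"/part_* > "$workdir/rebuilt.txt"
if cmp -s "$workdir/big.txt" "$workdir/rebuilt.txt"; then
    roundtrip="identical"
else
    roundtrip="different"
fi

echo "$part_count parts, part_001 starts at $first_of_second, part_002 ends at $last_of_third"
echo "rebuilt file is $roundtrip"

rm -r "$workdir"
```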
- grep
To find files containing some string or regular expression:
grep myWeirdFunctionName *.cpp
Or recursively:
grep -r myWeirdFunctionName *
To display at what line it is found:
grep -n myWeirdFunctionName myweirdlibrary.cpp
- cut
To display only some columns of a tab-separated file (example file: Data_Mining_heart.txt):
$ head Data_Mining_heart.txt
age sex chest_pain_type resting_blood_pressure serum_cholesterol fasting_blood_sugar resting_ecg_results max_heart_rate exercise_induced_angina depression_induced slope major_vessels thal disease
continuous discrete discrete continuous continuous discrete discrete continuous discrete continuous continuous continuous discrete discrete
class
70 M 4 130 322 FALSE 2 109 FALSE 2.4 2 3 normal TRUE
67 F 3 115 564 FALSE 2 160 FALSE 1.6 2 0 reversable_defect FALSE
57 M 2 124 261 FALSE 0 141 FALSE 0.3 1 0 reversable_defect TRUE
64 M 4 128 263 FALSE 0 105 TRUE 0.2 2 1 reversable_defect FALSE
74 F 2 120 269 FALSE 2 121 TRUE 0.2 1 1 normal FALSE
65 M 4 120 177 FALSE 0 140 FALSE 0.4 1 0 reversable_defect FALSE
56 M 3 130 256 TRUE 2 142 TRUE 0.6 2 1 fixed_defect TRUE
$ head heart.txt.orange.tab | cut -f 1,4
age resting_blood_pressure
continuous continuous
70 130
67 115
57 124
64 128
74 120
65 120
56 130
# sometimes we need to specify the character delimiting the columns
$ tail clinical_info.csv
"X86A40";12.17;;"F";"Uppsala";"F";61;"F";24;"G1"
"X87A79";12.08;;"T";"Uppsala";"T";36;"T";12;"G2"
"X88A67";4.25;"F";"T";"Uppsala";"T";63;"T";24;"G3"
"X89A64";12.08;;"T";"Uppsala";"T";60;"T";23;"G1"
"X8B87";11.33;;"T";"Uppsala";"T";58;"T";17;"G2"
"X90A63";2.67;;"T";"Uppsala";"T";76;"T";26;"G3"
"X94A16";11.08;;"T";"Uppsala";"T";73;"T";6;"G2"
"X96A21";0.08;"F";"T";"Uppsala";"T";63;"T";38;"G3"
"X99A50";10.5;;"T";"Uppsala";"T";82;"F";19;"G2"
"X9B52";11.33;;"T";"Uppsala";"T";71;"T";12;"G3"
$ tail clinical_info.csv | cut -f 10 -d';'
"G1"
"G2"
"G3"
"G1"
"G2"
"G3"
"G2"
"G3"
"G2"
"G3"
- sort, wc
# sort thal values
$ head Data_Mining_heart.txt | cut -f 13 | sort
discrete
fixed_defect
normal
normal
reversable_defect
reversable_defect
reversable_defect
reversable_defect
thal
# remove duplicates
$ cat Data_Mining_heart.txt | cut -f 13 | sort -u
discrete
fixed_defect
normal
reversable_defect
thal
# number of lines, words, and bytes (in that order)
$ wc *.loocv
281 1941 14174 knn.10.loocv
281 1941 14154 knn.11.loocv
281 1941 14171 knn.12.loocv
281 1941 14174 knn.13.loocv
281 1941 14150 knn.14.loocv
281 1941 14167 knn.15.loocv
281 1941 14150 knn.16.loocv
281 1941 14166 knn.17.loocv
281 1941 14172 knn.18.loocv
281 1941 14161 knn.19.loocv
281 1941 14194 knn.1.loocv
281 1941 14179 knn.20.loocv
281 1941 14153 knn.2.loocv
281 1941 14162 knn.3.loocv
281 1941 14166 knn.4.loocv
281 1941 14149 knn.5.loocv
281 1941 14164 knn.6.loocv
281 1941 14159 knn.7.loocv
281 1941 14184 knn.8.loocv
281 1941 14148 knn.9.loocv
281 1941 14210 NaiveBayes.loocv
5901 40761 297507 total
# number of unique values in the thal column (the count includes the header lines)
$ cat Data_Mining_heart.txt | cut -f 13 | sort -u | wc -l
6
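A common next step after sort -u is to count how often each value occurs. The sketch below does that with sort piped into uniq -c (uniq is not covered above; it collapses adjacent identical lines, which is why the input must be sorted first). The six data rows are invented for the example and only mimic the shape of the heart data set.

```shell
#!/usr/bin/env bash
# Hypothetical two-column, tab-separated sample: age and thal.
data=$(printf '70\tnormal\n67\treversable_defect\n57\treversable_defect\n74\tnormal\n56\tfixed_defect\n64\treversable_defect\n')

# Number of distinct values in column 2.
distinct=$(printf '%s\n' "$data" | cut -f 2 | sort -u | wc -l)

# Occurrences of each value, most frequent first.
counts=$(printf '%s\n' "$data" | cut -f 2 | sort | uniq -c | sort -rn)

# Keep only the most frequent value and its count.
read -r top_count top_value < <(printf '%s\n' "$counts" | head -n 1)

echo "$distinct distinct values"
echo "$counts"
echo "most frequent: $top_value ($top_count times)"
```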
- find
find filters files and directories based on various attributes:
- name/pattern
- date/age
- size
- type
- permissions
- and others...
# find files by name recursively, starting from the current directory
# (quote the pattern so that the shell does not expand it before find sees it)
find ./ -name 'what*I*am*looking*'
# by time accessed (amin in minutes or atime in days), changed (cmin in minutes or ctime in days), modified (mmin in minutes or mtime in days)
## accessed less than 10 minutes ago
find ./ -amin -10
## changed more than 1 hour ago
find ./ -cmin +60
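The name, type and age filters can be tried out on a small directory tree built for the purpose. This is only a sketch: the file names are invented, and touch -d with a relative date assumes GNU coreutils.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
mkdir -p "$workdir/src/lib"
touch "$workdir/src/main.cpp" "$workdir/src/lib/util.cpp" "$workdir/notes.txt"

# Pretend notes.txt was last modified two hours ago.
touch -d '2 hours ago' "$workdir/notes.txt"

# By name and type: the quotes keep the shell from expanding *.cpp itself.
cpp_count=$(find "$workdir" -type f -name '*.cpp' | wc -l)

# By age: modified more than 60 minutes ago / less than 10 minutes ago.
old_file=$(find "$workdir" -type f -mmin +60)
recent_count=$(find "$workdir" -type f -mmin -10 | wc -l)

echo "$cpp_count .cpp files, $recent_count recent files"
echo "old file: ${old_file##*/}"

rm -r "$workdir"
```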
- sed
To replace something (e.g. jamaica.biotoul.fr) by somethingelse (e.g. jamaica.ibcg.biotoul.fr) in a file:
sed -i 's/jamaica.biotoul.fr/jamaica.ibcg.biotoul.fr/g' gsiwikidb.after_sed.sql
Remove sequence limits from Jalview output:
sed -i 's/\/[0-9]*-[0-9]*//' CleanupFile_slimites.fa
To replace from a file to a file:
sed 's/jamaica.biotoul.fr/jamaica.ibcg.biotoul.fr/g' < gsiwikidb.before_sed.sql > gsiwikidb.after_sed.sql
To apply that to a set of files using find:
find /var/www -type f -exec sed -i 's/jamaica.biotoul.fr/jamaica.ibcg.biotoul.fr/g' {} \;
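The {} is replaced by the path of each file found, and the trailing \; marks the end of the command given to -exec. The backslash keeps the shell from interpreting the semicolon itself.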
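The find/sed combination above can be rehearsed on disposable files before running it on a real tree. In this sketch the host names old.example.org and new.example.org and the file layout are placeholders; it uses the {} + terminator, which passes many files to one sed call, where \; would run sed once per file. sed -i assumes GNU sed.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
mkdir -p "$workdir/www/sub"
echo 'url=http://old.example.org/a' > "$workdir/www/index.html"
echo 'see old.example.org and old.example.org' > "$workdir/www/sub/page.html"
echo 'nothing to change' > "$workdir/www/other.txt"

# Dots are escaped in the pattern so that they only match a literal dot.
find "$workdir/www" -type f -exec sed -i 's/old\.example\.org/new.example.org/g' {} +

# Files still containing the old name, and number of replaced occurrences.
remaining=$(grep -rl 'old\.example\.org' "$workdir/www" | wc -l)
replaced=$(grep -ro 'new\.example\.org' "$workdir/www" | wc -l)

echo "files still to fix: $remaining, occurrences replaced: $replaced"

rm -r "$workdir"
```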
Processes
# list processes. The PID (process id) column gives the identifier used to send signals
$ ps faux | less
# top processes, hit M or P to sort by memory or CPU, q to exit
# physical memory used is the RES column (RESident)
$ top
# a more user friendly version: htop http://htop.sourceforge.net/
# launch a program in background with &
$ longtask &
# you can run multiple commands in parallel:
$ cmd1 & cmd2 & cmd3
$ date & ls
[1] 8271
Tue Mar 5 17:15:57 CET 2013
heart.txt.orange.tab knn.13.loocv knn.17.loocv knn.20.loocv knn.5.loocv knn.9.loocv NaiveBayes.py
knn.10.loocv knn.14.loocv knn.18.loocv knn.2.loocv knn.6.loocv knn.py sample-heart.tab
[1]+ Done date
# if you forget the &
# it is possible to stop the process (in the foreground) with control+Z
$ sleep 9999
^Z
[1]+ Stopped sleep 9999
$ sleep 1239999
^Z
[2]+ Stopped sleep 1239999
# list of jobs of the current shell
$ jobs
[1]- Stopped sleep 9999
[2]+ Stopped sleep 1239999
# put job 1 in the foreground
$ fg 1
sleep 9999
^Z
[1]+ Stopped sleep 9999
# put it in the background
$ bg 1
[1]+ sleep 9999 &
$ jobs
[1]- Running sleep 9999 &
[2]+ Stopped sleep 1239999
$ fg 2
^C
$ ps
PID TTY TIME CMD
8342 pts/0 00:00:00 sleep
8465 pts/0 00:00:00 ps
10318 pts/0 00:00:00 bash
# kill a process by its PID
$ kill 8342
[1]+ Terminated sleep 9999
# by default, the kill command asks the process to stop
# but sometimes the process is not listening (e.g. it is stopped)
$ sleep 888888
^Z
[1]+ Stopped sleep 888888
[barriot@gamborimbo TP-Classification]$ ps
PID TTY TIME CMD
8513 pts/0 00:00:00 sleep
8520 pts/0 00:00:00 ps
10318 pts/0 00:00:00 bash
$ kill 8513
$ ps
PID TTY TIME CMD
8513 pts/0 00:00:00 sleep
8535 pts/0 00:00:00 ps
10318 pts/0 00:00:00 bash
$ jobs
[1]+ Stopped sleep 888888
# nothing happened because the process is stopped and thus cannot receive and respond to our request (until it is running again)
# to kill such a process, we have to send a SIGKILL (9) instead of the default SIGTERM (15)
$ kill -9 8513
[1]+ Killed sleep 888888
# this way the kernel terminates the process directly instead of asking the process to exit
# list of signals
$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL       5) SIGTRAP
 6) SIGABRT      7) SIGBUS       8) SIGFPE       9) SIGKILL     10) SIGUSR1
11) SIGSEGV     12) SIGUSR2     13) SIGPIPE     14) SIGALRM     15) SIGTERM
16) SIGSTKFLT   17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU     25) SIGXFSZ
26) SIGVTALRM   27) SIGPROF     28) SIGWINCH    29) SIGIO       30) SIGPWR
31) SIGSYS      34) SIGRTMIN    35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3
38) SIGRTMIN+4  39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 47) SIGRTMIN+13
48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 51) SIGRTMAX-13 52) SIGRTMAX-12
53) SIGRTMAX-11 54) SIGRTMAX-10 55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7
58) SIGRTMAX-6  59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX
# notice SIGTSTP (sent by Ctrl-Z; SIGSTOP is its uncatchable counterpart) and SIGCONT (sent by fg and bg)
# it is possible to kill all processes having a given name
$ killall annoying_process
# when you run a command in the shell, the shell is its parent process
# if you log in to a remote host to run a very long analysis and get disconnected
# the remote shell dies and all of its children also die
# to prevent this behavior, it is possible to run the command with nohup
$ nohup ./long_analysis &
# nohup stands for no hang up. If you forgot the nohup, it is possible to achieve the same as follows:
$ sleep 999999999
^Z
[1]+ Stopped sleep 999999999
$ bg
[1]+ sleep 999999999 &
$ ps
PID TTY TIME CMD
8878 pts/0 00:00:00 sleep
8883 pts/0 00:00:00 ps
10318 pts/0 00:00:00 bash
$ disown -h 8878
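The difference between SIGTERM and SIGKILL on a stopped process, shown interactively above, can also be reproduced in a script. This sketch relies on two facts: $! holds the PID of the last background command, and wait returns 128 plus the signal number when the child was killed by a signal. The sleep durations are arbitrary.

```shell
#!/usr/bin/env bash
# 1) A running process receives the default SIGTERM (15) and terminates.
sleep 30 &
pid=$!
kill "$pid"
if wait "$pid"; then term_status=0; else term_status=$?; fi   # 128 + 15

# 2) A stopped process does not act on SIGTERM until it is continued.
sleep 30 &
pid=$!
kill -STOP "$pid"     # same effect as Ctrl-Z, but cannot be caught
sleep 1               # give the process time to actually stop
kill "$pid"           # SIGTERM stays pending
sleep 1
if kill -0 "$pid" 2>/dev/null; then   # signal 0 only tests that the PID exists
    stopped_survives_term="yes"
else
    stopped_survives_term="no"
fi

# 3) SIGKILL (9) is handled by the kernel, stopped or not.
kill -KILL "$pid"
if wait "$pid"; then kill_status=0; else kill_status=$?; fi   # 128 + 9

echo "SIGTERM on running process: exit status $term_status"
echo "stopped process still alive after SIGTERM: $stopped_survives_term"
echo "SIGKILL on stopped process: exit status $kill_status"
```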
bash shell
variables and aliases
# list of shell variables (use env or printenv to list only the exported, i.e. environment, variables)
$ set | head
AUTOJUMP_DATA_DIR=/home/barriot/.local/share/autojump
AUTOJUMP_HOME=/home/barriot
BASH=/usr/bin/bash
BASHOPTS=checkwinsize:cmdhist:expand_aliases:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
BASH_ALIASES=()
BASH_ARGC=()
BASH_ARGV=()
BASH_CMDS=()
BASH_LINENO=()
BASH_SOURCE=()
# value of a given variable (PATH is the list of directories in which executables are searched when a command is issued)
$ echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/local/bin:/home/barriot/bin:/usr/local/bin:/software/bin:/opt/bin:/home/barriot/.bin:/opt/bin:/usr/lib64/openmpi/bin
# PS1 is the formatting of the bash prompt. e.g. for [user@host current_dir]$
$ echo $PS1
[\u@\h \W]\$
# set or modify a variable (local to the current shell)
$ MYVAR=youpi
$ echo $MYVAR
youpi
# remove a variable
$ unset MYVAR
$ echo $MYVAR

$
# set or modify a variable for the current shell and its future children
$ export MYVAR=yopla
It is possible to customize the environment through the ~/.bashrc script, which is run every time the user starts an interactive bash shell. You can, for example, add your own bin directory to the PATH variable or alias some commands you use very often:
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc # the . (or source command) is like an include or import (it executes the script as if it was typed in the current shell)
fi
# add my own bin directories
export PATH=$PATH:$HOME/.bin:/opt/bin
# User specific aliases and functions
alias l='ls -lh'
alias ssh='ssh -Y'
alias psl='ps faux | less'
alias top='htop'
export VISUAL=geany
export EDITOR=geany
export RUBYOPT=rubygems
# OPENMPI (for phyml)
export PATH=$PATH:/usr/lib64/openmpi/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/openmpi/lib
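The difference between a plain shell variable and an exported one only becomes visible in a child process. A minimal sketch, reusing the MYVAR name from above and starting a child bash to look at its environment:

```shell
#!/usr/bin/env bash
unset MYVAR

# A plain assignment is local to the current shell...
MYVAR=youpi
child_sees_before=$(bash -c 'echo "${MYVAR:-<unset>}"')

# ...while an exported variable is copied into the environment of children.
export MYVAR
child_sees_after=$(bash -c 'echo "${MYVAR:-<unset>}"')

echo "before export, the child sees: $child_sees_before"
echo "after export, the child sees:  $child_sees_after"
```

The single quotes matter here: they prevent the parent shell from expanding MYVAR, so the expansion really happens in the child.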
exit code of a process $?, test, if
Following the convention of the main function of a C program (int main), every process exits with an integer value, its exit code. By convention, 0 means success and any other value signals a problem. In the shell, the exit code of the last command is available in the special parameter $?.
$ date
Wed Mar 6 09:56:58 CET 2013
$ echo $?
0
$ date -unrecognized_option
date: invalid option -- 'n'
Try `date --help' for more information.
$ echo $?
1
This exit code can be used by the calling program to take appropriate actions.
First, let's focus on && and || boolean operators and the lazy evaluation of an expression:
- true AND false AND .... will evaluate as false whatever comes after the 1st false encountered (thus, no need to evaluate what's left),
- false OR false OR true OR ... will evaluate to true whatever comes after the 1st true encountered.
This lazy evaluation can be used to chain commands: with && between commands, commands will be executed until one fails, and with || between commands, they will be executed until one succeeds.
$ ls > /dev/null || echo "unable to ls" && date -unrecognized 2> /dev/null || echo "problem with date command"
problem with date command
The above can be useful but is sometimes too limited for more elaborate tests. That's where the test program comes in.
$ which test
/usr/bin/test
It is a common source of confusion: beginners often name their first program test, and when they type test they actually run this one (or the shell builtin of the same name) instead of their own. No matter what modification is made to their source code, the execution does nothing, no error, nothing... until they invoke ./test
The test program evaluates expressions:
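The chaining rules can be checked with commands whose exit code is known in advance (true always returns 0, false always returns 1). The sketch below also shows a classic pitfall: a && b || c is not an if/else, because c also runs when b fails. The check_positive function is a hypothetical helper written for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit code 0) when its argument is > 0.
check_positive() { [ "$1" -gt 0 ]; }

# Capture exit codes through the chain itself.
ok_status=unset
fail_status=unset
check_positive 5  && ok_status=$?      # runs only on success
check_positive -3 || fail_status=$?    # runs only on failure

and_result=$(true  && echo "ran")        # right side runs: left succeeded
or_result=$(false || echo "fallback")    # right side runs: left failed
skipped=$(false && echo "never printed") # right side is skipped

# Pitfall: the echo runs although the first command succeeded,
# because the middle command failed.
pitfall=$(true && false || echo "c ran")

echo "ok=$ok_status fail=$fail_status"
echo "and: $and_result, or: $or_result, skipped: '$skipped', pitfall: $pitfall"
```

When the two branches must be mutually exclusive, an explicit if ... then ... else ... fi is the safer choice.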
- integers and strings comparisons
- file existence, type, permissions, date
# string length (-n non zero length?, -z zero length?)
$ test -n "youpi"
$ echo $?
0
$ test -n ""
$ echo $?
1
# string comparison (quote variables so that empty values or spaces do not break the test)
$ test "$HOME" = "/home/barriot" && echo $?
0
$ test "$HOME" != "/root" && echo $?
0
See man test for the other tests.
The exit code of any process (including test) can be used by an if statement:
$ date
Wed Mar 6 10:43:51 CET 2013
$ date +%H
10
$ current_hour=$(date +%H) # the output of date is stored in the current_hour shell variable
$ echo $current_hour
10
$ if test $current_hour -gt 12; then echo "already 12"; else echo "patience..."; fi
patience...
$ # often seen shortcut:
$ if [ $current_hour -gt 12 ]; then echo "already 12"; else echo "patience..."; fi
patience...
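The file tests mentioned earlier (existence, type, permissions) combine naturally with if and elif. A small sketch, where describe is a hypothetical function name and the files live in a temporary directory:

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a path using test's file operators.
describe() {
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -x "$1" ]; then
        echo "executable file"
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

workdir=$(mktemp -d)
script="$workdir/run.sh"
printf '#!/bin/sh\necho hello\n' > "$script"

before=$(describe "$script")     # created without the x permission
chmod u+x "$script"
after=$(describe "$script")
dir_kind=$(describe "$workdir")
missing_kind=$(describe "$workdir/does_not_exist")

echo "before chmod: $before, after chmod: $after"
echo "workdir: $dir_kind, other path: $missing_kind"

rm -r "$workdir"
```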
loops: for
for iterates through a list of values:
# from a given list
$ myList='1st 2nd last'
$ for i in $myList; do echo "Processing $i task"; done
Processing 1st task
Processing 2nd task
Processing last task
# combined with seq:
$ seq 2
1
2
$ seq 4 6
4
5
6
$ for i in $(seq 4 6); do echo "Processing task $i"; done
Processing task 4
Processing task 5
Processing task 6
Then one can elaborate, for example to apply sed to files containing a given expression:
# -l lists each matching file once, instead of one line per match
files=$(grep -Rl silico * | grep -v '\.svn')
for i in $files; do sed -i 's/silico.biotoul.fr/jamaica.ibcg.biotoul.fr/g' $i; done
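Looping over an unquoted list such as $files splits on every space, so a file name containing a space would be processed as two names. When the files can be described by a pattern, looping directly over a glob with the variable quoted avoids the problem. A sketch with invented file names:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/first file.txt"   # note the space in the name
printf 'c\n'    > "$workdir/second.txt"

total_lines=0
file_count=0
for f in "$workdir"/*.txt; do
    [ -e "$f" ] || continue          # the pattern matched nothing
    n=$(wc -l < "$f")                # reading from stdin: prints the number only
    total_lines=$((total_lines + n))
    file_count=$((file_count + 1))
    echo "Processing '${f##*/}': $n lines"
done

echo "$file_count files, $total_lines lines in total"

rm -r "$workdir"
```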
Archive
gzip, bzip2
gzip filename will produce filename.gz
gunzip filename.gz will do the opposite.
bzip2 usually produces smaller files, at the cost of more CPU time:
bzip2 file
bunzip2 file.bz2
tar
tar backs up and restores directories and files (preserving permissions, owner, ...).
# tar some directory tree
$ tar cvf myproject.tar project
# c creates new archive
# v is for verbose (prints out what files are archived)
# f is for the name of the archive (must be followed by a filename.tar)
# same with gzipped archive
$ tar cvzf myproject.tar.gz project
# backup whole file system
cd /
tar cpjf system_backup.tar.bz2 \
    --exclude=/system_backup.tar.bz2 \
    --exclude=/lost+found \
    --exclude=/media \
    --exclude=/mnt \
    --exclude=/proc \
    /
# c is for create
# p is for preserving permissions
# j is for producing compressed (bzip2) archive
# f is for specifying the archive filename
# list content of archive
$ tar tvf archive.tar
$ tar tvzf archive.tar.gz
$ tar tvzf archive.tgz
$ tar tvjf archive.tar.bz2
# extract whole archive
$ tar xf archive.tar
$ tar xzf archive.tar.gz
$ tar xjpf archive.tar.bz2 # preserve permissions and owner
# extract only one file (paths are stored without the leading /)
$ tar xjpf system_backup.tar.bz2 etc/fstab
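A full create, list and extract cycle can be rehearsed on a tiny directory tree. In this sketch the project layout is invented, and -C (change to a directory before archiving or extracting) is used so that the archive holds relative paths and nothing is written outside the temporary directory.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
mkdir -p "$workdir/project/docs"
echo 'hello' > "$workdir/project/README"
echo 'notes' > "$workdir/project/docs/notes.txt"

# Create a gzipped archive of the project directory.
tar -czf "$workdir/myproject.tar.gz" -C "$workdir" project

# List its content: two directories and two files.
listing=$(tar -tzf "$workdir/myproject.tar.gz")
entry_count=$(printf '%s\n' "$listing" | wc -l)

# Extract a single file into another directory.
mkdir "$workdir/restore"
tar -xzf "$workdir/myproject.tar.gz" -C "$workdir/restore" project/docs/notes.txt
restored=$(cat "$workdir/restore/project/docs/notes.txt")
if [ -e "$workdir/restore/project/README" ]; then
    readme_restored="yes"
else
    readme_restored="no"
fi

printf '%s\n' "$listing"
echo "restored content: $restored (README extracted: $readme_restored)"

rm -r "$workdir"
```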
rsync
rsync is a great program for backups. It transfers only what differs or is newer, and supports backups over the network (through ssh).
Useful options:
- --archive (-a): recursive copy, preserves symlinks, permissions, times, owner, group, devices/specials
- --compress (-z): compress data during the transfer (useful over ssh)
- --itemize-changes (-i): nice display of what's done
- --stats: prints statistics on what happened
- -h: human-readable numbers
- -v: verbose
- --progress: show progress during the transfer
- --dry-run (-n): perform a simulation without changing anything
- --delete: remove from the destination what is no longer in the source
- -c: skip based on checksum, not mod-time & size
examples:
- simulate a backup
rsync --dry-run --archive --itemize-changes --stats -h src_dir dest_dir
- backup a directory
rsync --archive --itemize-changes --stats -h src_dir dest_dir
- backup and remove from the destination what is no longer in the source dir
rsync --delete --archive --itemize-changes --stats -h src_dir dest_dir
- backup over ssh
rsync --archive --itemize-changes --stats -h -e ssh src_dir user@host:dest_dir
Network
nslookup, ping
Symbolic host names are translated to IP addresses (IPv4 or IPv6). IPv4 addresses are of the form 10.0.0.1, while IPv6 addresses are longer (to allow more machines on the network).
When contacting a host by name, the system needs to find its IP address. This is name resolution, and it is provided by the DNS (Domain Name System).
$ nslookup silico.biotoul.fr
Server: 192.168.11.1
Address: 192.168.11.1#53
Name: silico.biotoul.fr
Address: 193.48.191.15
To check whether a host is reachable and powered on:
$ ping www.google.com
PING www.google.com (74.125.230.242) 56(84) bytes of data.
64 bytes from par08s10-in-f18.1e100.net (74.125.230.242): icmp_req=1 ttl=51 time=19.9 ms
64 bytes from par08s10-in-f18.1e100.net (74.125.230.242): icmp_req=2 ttl=51 time=21.6 ms
However, some hosts are configured not to answer ping requests.
wget
Another useful command is wget. It retrieves files and websites from the internet (http, https, http proxy, ftp).
wget www.google.com
wget ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/SOFT/by_platform/GPL199/GPL166_family.soft.gz
See the man page for other uses (authentication, whole site retrieval, following links on a page, ...).
ssh, scp
ssh stands for secure shell and connects to a host over an encrypted connection.
ssh host_or_ip
ssh user@host_or_ip
To launch graphical programs on the remote host, the X server connection must be forwarded (the X server is what displays graphics on a Linux system):
ssh -X host
or, with trusted forwarding (fewer security restrictions, sometimes needed by applications that fail with -X):
ssh -Y host
scp copies files from or to a remote server:
scp mylocalfile user@host:path/newname
To copy a whole directory, add -r (recursive):
scp -r user@host:/home/barriot ./barriot_copy
Job scheduling: cron, at
Programs can be scheduled to run once (with the at command) or periodically (with the cron system).
cron runs commands periodically:
# list of cron jobs
$ crontab -l
no crontab for barriot
[root@fidji ~]# crontab -l
0 0 * * * /root/backup_scripts/iroise_db_backup daily
0 8 * * 6 /root/backup_scripts/iroise_db_backup weekly
0 2 1 * * /root/backup_scripts/iroise_db_backup monthly
crontab syntax (from wikipedia):
* * * * *  command to be executed
┬ ┬ ┬ ┬ ┬
│ │ │ │ │
│ │ │ │ └───── day of week (0 - 7) (0 or 7 are Sunday, or use names)
│ │ │ └────────── month (1 - 12)
│ │ └─────────────── day of month (1 - 31)
│ └──────────────────── hour (0 - 23)
└───────────────────────── min (0 - 59)
To modify the crontab:
crontab -e
This launches the default text editor (the one set in the VISUAL or EDITOR variable, often vi) on the current schedule. Save the modifications and exit the editor to install the new schedule.
Other useful commands
diff & diffuse
diff displays the differences between 2 files:
$ diff ~/Documents/Dev/perllibs/DBConnection.pm /software/perllibs/DBConnection.pm | more
2c2
< # Version: $Id: DBConnection.pm 34 2011-03-01 09:46:00Z gsi $
---
> # Version: $Id: DBConnection.pm 45 2012-10-10 10:28:48Z gsi $
14c14
< # CONNECTION
---
> # CONNECT
17a18,20
> # DISCONNECT
> $db->close;
>
46a50,55
> ###########
> # BONUSES #
> ###########
> ...
Or side by side:
$ diff -y ~/Documents/Dev/perllibs/DBConnection.pm /software/perllibs/DBConnection.pm | head
package DBConnection;                                           package DBConnection;
# Version: $Id: DBConnection.pm 34 2011-03-01 09:46:00Z gsi $ | # Version: $Id: DBConnection.pm 45 2012-10-10 10:28:48Z gsi $
=head1 NAME                                                     =head1 NAME
DBConnection - a helper/wrapper for MySQL (DBI) database conn   DBConnection - a helper/wrapper for MySQL (DBI) database conn
=head1 SYNOPSYS                                                 =head1 SYNOPSYS
# USE: installed in /software/perllibs                          # USE: installed in /software/perllibs
$ diff -y ~/Documents/Dev/perllibs/DBConnection.pm /software/perllibs/DBConnection.pm | more
package DBConnection;                                           package DBConnection;
# Version: $Id: DBConnection.pm 34 2011-03-01 09:46:00Z gsi $ | # Version: $Id: DBConnection.pm 45 2012-10-10 10:28:48Z gsi $
=head1 NAME                                                     =head1 NAME
DBConnection - a helper/wrapper for MySQL (DBI) database conn   DBConnection - a helper/wrapper for MySQL (DBI) database conn
=head1 SYNOPSYS                                                 =head1 SYNOPSYS
# USE: installed in /software/perllibs                          # USE: installed in /software/perllibs
use lib '/software/perllibs';                                   use lib '/software/perllibs';
use DBConnection;                                               use DBConnection;
# CONNECTION                                                  | # CONNECT
my $db = DBConnection->new(host=>'localhost', db=>'cgdb', us    my $db = DBConnection->new(host=>'localhost', db=>'cgdb', us
$db->init || die 'Cannot connect to database';                  $db->init || die 'Cannot connect to database';
                                                              > # DISCONNECT
                                                              > $db->close;
                                                              >
# SINGLE SELECT                                                 # SINGLE SELECT
The diffuse program offers a graphical interface and makes it easier to merge files.
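In scripts, the exit code of diff is often more useful than its output: 0 means the files are identical, 1 means they differ (2 signals an error). A short sketch on two invented files, also counting the change hunks of the default output format:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n'     > "$workdir/a.txt"
printf 'one\n2\nthree\nfour\n' > "$workdir/b.txt"

# Identical files: exit code 0 (-q only reports whether files differ).
if diff -q "$workdir/a.txt" "$workdir/a.txt" > /dev/null; then
    same_status=0
else
    same_status=$?
fi

# Different files: exit code 1.
if diff "$workdir/a.txt" "$workdir/b.txt" > /dev/null; then
    diff_status=0
else
    diff_status=$?
fi

# Hunk headers such as 2c2 or 3a4 are the only lines starting with a digit.
changes=$(diff "$workdir/a.txt" "$workdir/b.txt" || true)
hunk_count=$(printf '%s\n' "$changes" | grep -c '^[0-9]')

printf '%s\n' "$changes"
echo "same files: $same_status, different files: $diff_status, hunks: $hunk_count"

rm -r "$workdir"
```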