6. Startup File
Once you know how to change default settings of Wget through command
line arguments, you may wish to make some of those settings permanent.
You can do that in a convenient way by creating the Wget startup
file---`.wgetrc'.
Although `.wgetrc' is the "main" initialization file, it is
convenient to have a special facility for storing passwords. Thus Wget
reads and interprets the contents of `$HOME/.netrc', if it finds
it. The `.netrc' format is described in your system manuals.
Wget reads `.wgetrc' upon startup, recognizing a limited set of
commands.
6.1 Wgetrc Location
When initializing, Wget will look for a global startup file,
`/usr/local/etc/wgetrc' by default (or some prefix other than
`/usr/local', if Wget was not installed there) and read commands
from there, if it exists.
Then it will look for the user's file. If the environment variable
WGETRC is set, Wget will try to load that file. Failing that, no
further attempts will be made.
If WGETRC is not set, Wget will try to load `$HOME/.wgetrc'.
The fact that the user's settings are loaded after the system-wide ones
means that in case of a conflict the user's wgetrc overrides the
system-wide wgetrc (in `/usr/local/etc/wgetrc' by default).
Fascist admins, away!
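The lookup order described in this section can be sketched in Python. This is only an illustration of the rules above, not Wget's own code: the function name `wgetrc_load_order' and its parameters are hypothetical, and the function merely reports which files would be tried, without reading them.

```python
import os


def wgetrc_load_order(environ, prefix="/usr/local"):
    """Return the startup files that would be tried, in load order.

    Hypothetical helper: it models only the rules stated in this section.
    """
    # The global file is tried first, so later files can override it.
    order = [prefix + "/etc/wgetrc"]
    if "WGETRC" in environ:
        # WGETRC is set: only that file is tried, with no fallback.
        order.append(environ["WGETRC"])
    elif "HOME" in environ:
        # WGETRC is not set: fall back to the user's $HOME/.wgetrc.
        order.append(environ["HOME"] + "/.wgetrc")
    return order


print(wgetrc_load_order(dict(os.environ)))
```

Because the user's file comes last in the returned list, its settings are the ones that win when both files set the same command.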
6.2 Wgetrc Syntax
The syntax of a wgetrc command is simple:
    variable = value
The variable will also be called command. Valid
values are different for different commands.
The commands are case-insensitive and underscore-insensitive. Thus
`DIr__PrefiX' is the same as `dirprefix'. Empty lines, lines
beginning with `#' and lines containing white-space only are
discarded.
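The matching and filtering rules above can be sketched in a few lines of Python. This is a simplified model and not Wget's parser: the names `normalize_command' and `parse_wgetrc' are hypothetical, and the sketch assumes that every command has the form `name = value' and that leading white-space before `#' is ignored.

```python
def normalize_command(name):
    """Fold case and drop underscores, so 'DIr__PrefiX' becomes 'dirprefix'."""
    return name.replace("_", "").lower()


def parse_wgetrc(text):
    """Yield (command, value) pairs from wgetrc-style text.

    Hypothetical sketch: assumes each command looks like 'name = value'.
    """
    for line in text.splitlines():
        stripped = line.strip()
        # Discard empty lines, white-space-only lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        name, _, value = stripped.partition("=")
        yield normalize_command(name.strip()), value.strip()


sample = "# a comment\n\n   \nDIr__PrefiX = /tmp/downloads\ntries = 3\n"
print(list(parse_wgetrc(sample)))
```

Only the two real commands survive; the comment, the empty line and the white-space-only line are dropped.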
Commands that expect a comma-separated list will clear the list on an
empty command. So, if you wish to reset the rejection list specified in
global `wgetrc', you can do it with:
    reject =
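To illustrate how an empty command resets a list, here is a minimal Python sketch. The function `apply_list_command' is hypothetical, and the sketch assumes that a non-empty value appends its items to the current list; only the clearing behavior is stated in the text above.

```python
def apply_list_command(current, value):
    """Return the new list after one comma-separated list command.

    Hypothetical model: an empty value clears the list; a non-empty
    value is assumed to append its items to the existing ones.
    """
    items = [item.strip() for item in value.split(",") if item.strip()]
    if not items:
        return []  # empty command: reset the list
    return current + items


reject = []
reject = apply_list_command(reject, "*.gif,*.jpg")  # from the global wgetrc
reject = apply_list_command(reject, "")             # user's wgetrc: reset
reject = apply_list_command(reject, "*.tmp")        # user's own pattern
print(reject)
```

After the reset, only the pattern from the user's file remains in the list.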
6.3 Wgetrc Commands
The complete set of commands is listed below. Legal values are listed
after the `='. Simple Boolean values can be set or unset using
`on' and `off' or `1' and `0'. A fancier kind of
Boolean allowed in some cases is the lockable Boolean, which may
be set to `on', `off', `always', or `never'. If an
option is set to `always' or `never', that value will be
locked in for the duration of the Wget invocation--command-line options
will not override.
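The difference between a simple and a lockable Boolean can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and representing "option not given on the command line" as `None' is a convention invented for this sketch.

```python
def parse_lockable(value):
    """Return (enabled, locked) for a lockable Boolean value."""
    table = {
        "on": (True, False),
        "off": (False, False),
        "always": (True, True),
        "never": (False, True),
    }
    return table[value.strip()]


def effective_setting(wgetrc_value, cli_value=None):
    """Combine a wgetrc lockable Boolean with an optional command-line value.

    cli_value is True or False if the option was given on the command
    line, or None if it was not (an assumption of this sketch).
    """
    enabled, locked = parse_lockable(wgetrc_value)
    if locked or cli_value is None:
        # A locked value cannot be overridden for this invocation.
        return enabled
    return cli_value


print(effective_setting("off", cli_value=True))    # command line wins
print(effective_setting("never", cli_value=True))  # wgetrc value is locked
```

With `off' the command line can still enable the option; with `never' it cannot.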
Some commands take pseudo-arbitrary values. address values can be
hostnames or dotted-quad IP addresses. n can be any positive
integer, or `inf' for infinity, where appropriate. string
values can be any non-empty string.
Most of these commands have direct command-line equivalents. Also, any
wgetrc command can be specified on the command line using the
`--execute' switch (see section 2.3 Basic Startup Options).
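As an illustration of passing wgetrc commands through `--execute', the following Python sketch builds a Wget argument list. The helper `wget_argv' is hypothetical, the URL is a placeholder, and the sketch assumes that `--execute' may be repeated once per command; it only builds and prints the command line and does not run Wget.

```python
import shlex


def wget_argv(url, commands):
    """Build a wget argument list that passes wgetrc commands via --execute.

    Hypothetical helper; it only builds the list and does not run wget.
    """
    argv = ["wget"]
    for name, value in commands.items():
        # Each wgetrc command becomes one --execute argument.
        argv += ["--execute", f"{name}={value}"]
    argv.append(url)
    return argv


argv = wget_argv("http://example.com/", {"tries": "3", "wait": "2"})
print(shlex.join(argv))
```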
- accept/reject = string
- Same as `-A'/`-R' (see section 4.2 Types of Files).
- add_hostdir = on/off
- Enable/disable host-prefixed file names. `-nH' disables it.
- continue = on/off
- If set to on, force continuation of preexistent partially retrieved
files. See `-c' before setting it.
- background = on/off
- Enable/disable going to background--the same as `-b' (which
enables it).
- backup_converted = on/off
- Enable/disable saving pre-converted files with the suffix
`.orig'---the same as `-K' (which enables it).
- base = string
- Consider relative URLs in URL input files forced to be
interpreted as HTML as being relative to string---the same as
`-B'.
- bind_address = address
- Bind to address, like the `--bind-address' option.
- cache = on/off
- When set to off, disallow server-caching. See the `--no-cache'
option.
- convert_links = on/off
- Convert non-relative links locally. The same as `-k'.
- cookies = on/off
- When set to off, disallow cookies. See the `--cookies' option.
- load_cookies = file
- Load cookies from file. See `--load-cookies'.
- save_cookies = file
- Save cookies to file. See `--save-cookies'.
- connect_timeout = n
- Set the connect timeout--the same as `--connect-timeout'.
- cut_dirs = n
- Ignore n remote directory components.
- debug = on/off
- Debug mode, same as `-d'.
- delete_after = on/off
- Delete after download--the same as `--delete-after'.
- dir_prefix = string
- Top of directory tree--the same as `-P'.
- dirstruct = on/off
- Turn dirstruct on or off--the same as `-x' or `-nd',
respectively.
- dns_cache = on/off
- Turn DNS caching on/off. Since DNS caching is on by default, this
option is normally used to turn it off. Same as `--dns-cache'.
- dns_timeout = n
- Set the DNS timeout--the same as `--dns-timeout'.
- domains = string
- Same as `-D' (see section 4.1 Spanning Hosts).
- dot_bytes = n
- Specify the number of bytes "contained" in a dot, as seen throughout
the retrieval (1024 by default). You can postfix the value with
`k' or `m', representing kilobytes and megabytes,
respectively. With dot settings you can tailor the dot retrieval to
suit your needs, or you can use the predefined styles
(see section 2.5 Download Options).
- dots_in_line = n
- Specify the number of dots that will be printed in each line throughout
the retrieval (50 by default).
- dot_spacing = n
- Specify the number of dots in a single cluster (10 by default).
- exclude_directories = string
- Specify a comma-separated list of directories you wish to exclude from
download--the same as `-X' (see section 4.3 Directory-Based Limits).
- exclude_domains = string
- Same as `--exclude-domains' (see section 4.1 Spanning Hosts).
- follow_ftp = on/off
- Follow FTP links from HTML documents--the same as
`--follow-ftp'.
- follow_tags = string
- Only follow certain HTML tags when doing a recursive retrieval, just like
`--follow-tags'.
- force_html = on/off
- If set to on, force the input filename to be regarded as an HTML
document--the same as `-F'.
- ftp_proxy = string
- Use string as FTP proxy, instead of the one specified in
environment.
- glob = on/off
- Turn globbing on/off--the same as `--glob' and `--no-glob'.
- header = string
- Define an additional header, like `--header'.
- html_extension = on/off
- Add a `.html' extension to `text/html' or
`application/xhtml+xml' files without it, like
`-E'.
- http_keep_alive = on/off
- Turn the keep-alive feature on or off (defaults to on). The same as
`--http-keep-alive'.
- http_passwd = string
- Set HTTP password.
- http_proxy = string
- Use string as HTTP proxy, instead of the one specified in
environment.
- http_user = string
- Set HTTP user to string.
- ignore_length = on/off
- When set to on, ignore the Content-Length header; the same as
`--ignore-length'.
- ignore_tags = string
- Ignore certain HTML tags when doing a recursive retrieval, just like
`--ignore-tags'.
- include_directories = string
- Specify a comma-separated list of directories you wish to follow when
downloading--the same as `-I'.
- input = string
- Read the URLs from string, like `-i'.
- kill_longer = on/off
- Consider data longer than specified in the Content-Length header as
invalid (and retry getting it). The default behavior is to save as much
data as there is, provided it is at least the value in Content-Length.
- limit_rate = rate
- Limit the download speed to no more than rate bytes per second.
The same as `--limit-rate'.
- logfile = string
- Set logfile--the same as `-o'.
- login = string
- Your user name on the remote machine, for FTP. Defaults to
`anonymous'.
- mirror = on/off
- Turn mirroring on/off. The same as `-m'.
- netrc = on/off
- Turn reading netrc on or off.
- noclobber = on/off
- Same as `-nc'.
- no_parent = on/off
- Disallow retrieving outside the directory hierarchy, like
`--no-parent' (see section 4.3 Directory-Based Limits).
- no_proxy = string
- Use string as the comma-separated list of domains to avoid in
proxy loading, instead of the one specified in environment.
- output_document = string
- Set the output filename--the same as `-O'.
- page_requisites = on/off
- Download all ancillary documents necessary for a single HTML page to
display properly--the same as `-p'.
- passive_ftp = on/off/always/never
- Set passive FTP---the same as `--passive-ftp'. Some scripts
and `.pm' (Perl module) files download files using `wget
--passive-ftp'. If your firewall does not allow this, you can set
`passive_ftp = never' to override the command-line.
- passwd = string
- Set your FTP password to string. Without this setting, the
password defaults to `username@hostname.domainname'.
- post_data = string
- Use POST as the method for all HTTP requests and send string in
the request body. The same as `--post-data'.
- post_file = file
- Use POST as the method for all HTTP requests and send the contents of
file in the request body. The same as `--post-file'.
- progress = string
- Set the type of the progress indicator. Legal types are "dot" and
"bar".
- protocol_directories = on/off
- When set, use the protocol name as a directory component of local file
names. The same as `--protocol-directories'.
- proxy_user = string
- Set proxy authentication user name to string, like `--proxy-user'.
- proxy_passwd = string
- Set proxy authentication password to string, like `--proxy-passwd'.
- referer = string
- Set HTTP `Referer:' header just like `--referer'. (Note it
was the folks who wrote the HTTP spec who got the spelling of
"referrer" wrong.)
- quiet = on/off
- Quiet mode--the same as `-q'.
- quota = quota
- Specify the download quota, which is useful to put in the global
`wgetrc'. When download quota is specified, Wget will stop
retrieving after the download sum has become greater than quota. The
quota can be specified in bytes (default), kbytes (`k' appended) or
mbytes (`m' appended). Thus `quota = 5m' will set the quota
to 5 mbytes. Note that the user's startup file overrides system
settings.
- read_timeout = n
- Set the read (and write) timeout--the same as `--read-timeout'.
- reclevel = n
- Recursion level--the same as `-l'.
- recursive = on/off
- Recursive on/off--the same as `-r'.
- relative_only = on/off
- Follow only relative links--the same as `-L' (see section 4.4 Relative Links).
- remove_listing = on/off
- If set to on, remove FTP listings downloaded by Wget. Setting it
to off is the same as `--no-remove-listing'.
- restrict_file_names = unix/windows
- Restrict the file names generated by Wget from URLs. See
`--restrict-file-names' for a more detailed description.
- retr_symlinks = on/off
- When set to on, retrieve symbolic links as if they were plain files; the
same as `--retr-symlinks'.
- robots = on/off
- Specify whether the norobots convention is respected by Wget, "on" by
default. This switch controls both the `/robots.txt' and the
`nofollow' aspect of the spec. See section 9.1 Robot Exclusion, for more
details about this. Be sure you know what you are doing before turning
this off.
- server_response = on/off
- Choose whether or not to print the HTTP and FTP server
responses--the same as `-S'.
- span_hosts = on/off
- Same as `-H'.
- strict_comments = on/off
- Same as `--strict-comments'.
- timeout = n
- Set timeout value--the same as `-T'.
- timestamping = on/off
- Turn timestamping on/off. The same as `-N' (see section 5. Time-Stamping).
- tries = n
- Set number of retries per URL---the same as `-t'.
- use_proxy = on/off
- Turn proxy support on/off. The same as `-Y'.
- verbose = on/off
- Turn verbose on/off--the same as `-v'/`-nv'.
- wait = n
- Wait n seconds between retrievals--the same as `-w'.
- waitretry = n
- Wait up to n seconds between retries of failed retrievals
only--the same as `--waitretry'. Note that this is turned on by
default in the global `wgetrc'.
- randomwait = on/off
- Turn random between-request wait times on or off. The same as
`--random-wait'.
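Several of the commands above, such as `quota' and `dot_bytes', accept a number with an optional `k' or `m' suffix. The following Python sketch shows one way such values could be interpreted. It is not Wget's code: the function `parse_size' is hypothetical, and it assumes that a kilobyte is 1024 bytes, that a megabyte is 1024 kilobytes, and that `inf' means no limit.

```python
def parse_size(value):
    """Convert a size such as '5m', '512k', '2048' or 'inf' to bytes.

    Hypothetical sketch: assumes 1k = 1024 bytes and 1m = 1024k, and
    returns None for 'inf' (no limit).
    """
    text = value.strip().lower()
    if text == "inf":
        return None
    multipliers = {"k": 1024, "m": 1024 * 1024}
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


for example in ("5m", "512k", "2048", "inf"):
    print(example, "->", parse_size(example))
```

Under these assumptions, `quota = 5m' corresponds to 5242880 bytes.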
6.4 Sample Wgetrc
This is the sample initialization file, as given in the distribution.
It is divided into two sections--one for global usage (suitable for the
global startup file), and one for local usage (suitable for
`$HOME/.wgetrc'). Be careful about the things you change.
Note that almost all the lines are commented out. For a command to have
any effect, you must remove the `#' character at the beginning of
its line.
###
### Sample Wget initialization file .wgetrc
###
## You can use this file to change the default behaviour of wget or to
## avoid having to type many many command-line options. This file does
## not contain a comprehensive list of commands -- look at the manual
## to find out what you can put into this file.
##
## Wget initialization file can reside in /usr/local/etc/wgetrc
## (global, for all users) or $HOME/.wgetrc (for a single user).
##
## To use the settings in this file, you will have to uncomment them,
## as well as change them, in most cases, as the values on the
## commented-out lines are the default values (e.g. "off").
##
## Global settings (useful for setting up in /usr/local/etc/wgetrc).
## Think well before you change them, since they may reduce wget's
## functionality, and make it behave contrary to the documentation:
##
# You can set retrieve quota for beginners by specifying a value
# optionally followed by 'K' (kilobytes) or 'M' (megabytes). The
# default quota is unlimited.
#quota = inf
# You can lower (or raise) the default number of retries when
# downloading a file (default is 20).
#tries = 20
# Lowering the maximum depth of the recursive retrieval is handy to
# prevent newbies from going too "deep" when they unwittingly start
# the recursive retrieval. The default is 5.
#reclevel = 5
# Many sites are behind firewalls that do not allow initiation of
# connections from the outside. On these sites you have to use the
# `passive' feature of FTP. If you are not behind such a firewall,
# you can turn this off to make Wget use active FTP by default.
passive_ftp = on
# The "wait" command below makes Wget wait between every connection.
# If, instead, you want Wget to wait only between retries of failed
# downloads, set waitretry to maximum number of seconds to wait (Wget
# will use "linear backoff", waiting 1 second after the first failure
# on a file, 2 seconds after the second failure, etc. up to this max).
waitretry = 10
##
## Local settings (for a user to set in his $HOME/.wgetrc). It is
## *highly* undesirable to put these settings in the global file, since
## they are potentially dangerous to "normal" users.
##
## Even when setting up your own ~/.wgetrc, you should know what you
## are doing before doing so.
##
# Set this to on to use timestamping by default:
#timestamping = off
# It is a good idea to make Wget send your email address in a `From:'
# header with your request (so that server administrators can contact
# you in case of errors). Wget does *not* send `From:' by default.
#header = From: Your Name <username@site.domain>
# You can set up other headers, like Accept-Language. Accept-Language
# is *not* sent by default.
#header = Accept-Language: en
# You can set the default proxies for Wget to use for http and ftp.
# They will override the value in the environment.
#http_proxy = http://proxy.yoyodyne.com:18023/
#ftp_proxy = http://proxy.yoyodyne.com:18023/
# If you do not want to use proxy at all, set this to off.
#use_proxy = on
# You can customize the retrieval outlook. Valid options are default,
# binary, mega and micro.
#dot_style = default
# Setting this to off makes Wget not download /robots.txt. Be sure to
# know *exactly* what /robots.txt is and how it is used before changing
# the default!
#robots = on
# It can be useful to make Wget wait between connections. Set this to
# the number of seconds you want Wget to wait.
#wait = 0
# You can force creating directory structure, even if a single file is
# being retrieved, by setting this to on.
#dirstruct = off
# You can turn on recursive retrieving by default (don't do this if
# you are not sure you know what it means) by setting this to on.
#recursive = off
# To always back up file X as X.orig before converting its links (due
# to -k / --convert-links / convert_links = on having been specified),
# set this variable to on:
#backup_converted = off
# To have Wget follow FTP links from HTML files by default, set this
# to on:
#follow_ftp = off
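The "linear backoff" mentioned in the sample file's comment on `waitretry' can be sketched in Python. The function `retry_waits' is hypothetical and models only the schedule described in that comment: 1 second after the first failure, 2 seconds after the second, and so on, up to the configured maximum.

```python
def retry_waits(failures, waitretry):
    """Return the wait in seconds after each of `failures` failed attempts.

    Linear backoff: 1 s after the first failure, 2 s after the second,
    and so on, capped at `waitretry`.
    """
    return [min(n, waitretry) for n in range(1, failures + 1)]


print(retry_waits(12, waitretry=10))
```

With `waitretry = 10', the wait grows by one second per failure and stays at 10 seconds from the tenth failure onward.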
This document was generated by Autobuild on April 7, 2005 using texi2html.