Imagine that you need to borrow a hosted CSS file, along with its resources. You paste the CSS contents into a text editor and start searching for url() patterns within it. After seeing 100+ matches, you bless the name of the CSS sprite-oblivious person who built it.
That could be anyone's nightmare of a working day; hopefully it's not your reality. Don't worry! wget is here to save the day!
Downloading for offline use, then pruning the result
Turns out, wget has a very handy -p switch (--page-requisites) that downloads everything needed to display a given page offline. So if some page references the CSS file in question,
wget -p -k http://example.com/
... will download it along with the rest of the page's resources (-k additionally converts the links in the downloaded files so they work locally), and you can then pick the CSS file and its resources out of the result.
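One caveat: by default wget stays on the starting host, so if the CSS lives on another domain (a CDN, say), -p alone won't fetch it. A minimal sketch, assuming a hypothetical cdn.example.com, is to add -H (--span-hosts) and rein it in with -D (--domains):
# fetch page requisites across hosts, but only from the listed domains
wget -p -k -H -D example.com,cdn.example.com http://example.com/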
Download only what's needed
Now, downloading a whole HTML page and its cousins is not viable if the page is heavy with content, or if you don't know of a page that references the CSS file in the first place.
Instead, you can automate the necessary steps to achieve the same result:
- Download the CSS file
- Get all url() references from it
- Download each of those, relative to the CSS file's URL
Here's the command, with inline comments:
# get the CSS file
wget http://example.com/css/styles.css
# the script below assumes single quotes in the url() declarations
# find all url() references within the file
grep -oi "url('[^']*')" styles.css | \
# strip the url(' prefix, then everything from the query string
# or closing quote onwards
sed "s/url('//I" | sed "s/\(?\|'\).*//" | \
# sort so duplicates become adjacent and drop them:
# no need to download anything twice
sort -u | \
# pipe all URLs to wget, resolving them against the correct base URL
wget -i - -B http://example.com/css/
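The pattern above only matches single-quoted url('...') declarations. Here's a more forgiving sketch, still assuming GNU grep and sed, that also handles double-quoted and unquoted references, and skips inline data: URIs:
# match url() references with single, double, or no quotes
grep -oiE "url\(['\"]?[^'\")]+" styles.css | \
# strip the url( prefix and the optional opening quote
sed -E "s/url\(['\"]?//I" | \
# drop query strings, skip data: URIs, dedupe
sed "s/?.*//" | grep -vi "^data:" | sort -u | \
wget -i - -B http://example.com/css/
Relative references such as ../img/bg.png come out fine too: -B resolves them against the CSS file's directory, much like a browser would.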
Now you'll have time for something else. For example, learning how to solve the Rubik's cube.