Programming
This forum is for all programming questions. The question does not have to be directly related to Linux, and any language is fair game.
New to bash scripting.. like very new. I've really only written the most basic of things, echo "hello world" basic.
I'm looking for some pointers / guidance (happy to research once I know where I'm looking).
Anyway, what I would like to do is write a script that I can copy and paste an email into, and then have it parse the email for download links based on file type.
Then
Confirm those links are correct (yes / no)
Then
wget the files, extract them and push them to location X.
It sounds fairly simple, but in terms of getting it to parse the email for the links, I've not got the first idea and Google is turning into a rabbit hole.
For instance, are you looking at RFC 5322 Internet Message Format (what mail clients tend to show if you select "view source"), or the rendered message body only? If the latter, are you dealing with plain text, HTML, or both?
Also, what is your definition of "correct" in this context?
Generally, Greg Wooledge's BashFAQ is a good place to see the recommended way to perform specific common tasks, but it probably won't get you past the first step in this instance; as you're very new, you might want to try the BashGuide first.
The email will be just plain text; literally copy and paste the whole email.
I would normally have 10-15 image links included in the email, which at the moment I've been extracting by hand and feeding into a script to unpack. I'd rather just dump the email into the script, have it scan it for links with, let's say, .iso endings, and then come back with a list and ask: are these correct (i.e. have I pulled all the links and excluded anything not needed)?
text="
Email will be just plain text, Literally copy and paste the whole email.
https://link1.com
I would normally have 10-15 image links included in the email, which at the
https://link2.com
moment I've been extracting by hand and feeding into a script to unpack..I'd rather just
dump the email into the script, https://link3.com have it scan it for links including
lets .iso endings and then come back
with a list and say.. are these correct https://link4.com (i.e have a pulled all links
and excluded anything not needed).
https://link5.com
https://link6.com
I then say Y and it would start the WGET process. http://link8.com
http://link7.com
http://link9.com
"
grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" <<< "$text"
https://link1.com
https://link2.com
https://link3.com
https://link4.com
https://link5.com
https://link6.com
http://link8.com
http://link7.com
http://link9.com
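If you only want links of a particular file type (the .iso case mentioned above), the extension can be anchored to the end of the pattern; a sketch:
Code:
grep -Eo "https?://[a-zA-Z0-9./?=_-]*\.iso" <<< "$text"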
Quote:
Confirm those links are correct (yes / no)
Example 2:
Code:
urls=(
https://link1.com
https://link2.com
https://link3.com
https://link4.com
https://link5.com
https://link6.com
http://link8.com
http://link7.com
http://link9.com
)
agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0"
for u in "${urls[@]}"; do
    if curl -LIA "$agent" --retry 1 --max-time 1 --silent --fail "$u" -o /dev/null; then
        echo "$u is good"
    else
        echo "$u is bad"
    fi
done
https://link1.com is good
https://link2.com is bad
https://link3.com is good
https://link4.com is bad
https://link5.com is good
https://link6.com is bad
http://link8.com is good
http://link7.com is good
http://link9.com is good
I didn't know there was a link1.com.
Also:
Code:
for u in "${urls[@]}"; do
    if wget -U "$agent" --spider --tries=1 --timeout=1 "$u" 2>/dev/null; then
        echo "$u is good"
    else
        echo "$u is bad"
    fi
done
Not sure how much help you are looking for here, but maybe the following will help:
- Loop over the text broken up by whitespace.
- For each piece of text, see if it starts with "http://" or "https://".
- If it does, wget it.
Try to do the above and let us know where you get stuck; a rough sketch of these steps follows.
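A minimal sketch of those steps, assuming the pasted email has been saved to a file first (email.txt is just a placeholder name):
Code:
#!/bin/bash
# Read the whole email; email.txt is a stand-in for wherever you paste it.
text=$(< email.txt)

# $text is deliberately unquoted so that IFS splits it on whitespace.
for word in $text; do
    case $word in
        http://*|https://*)
            echo "Found URL: $word"
            wget "$word"
            ;;
    esac
done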
Yep, since there's no specific HTML or RFC 5322 parsing to do, this is a reasonable approach that can be done using Bash IFS word splitting, looping and conditionals.
Adding a user confirmation might involve "storing urls in an array" in order to print them and prompt for Y/N before wget is called on them.
Of course, since wget accepts multiple URLs, one could instead simply grep for "https?://\S+" inside a command substitution, though that approach is less helpful with regards to learning Bash. Putting those pieces together, something like the sketch below:
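A sketch only, combining the array idea with a single confirmation prompt (email.txt again stands in for wherever the pasted email ends up):
Code:
#!/bin/bash
# Collect every URL into an array via grep inside a command substitution.
mapfile -t urls < <(grep -Eo 'https?://[^[:space:]]+' email.txt)

# Print the list once, then ask a single Y/N question before fetching.
printf '%s\n' "${urls[@]}"
read -rp "Fetch these ${#urls[@]} links? [y/N] " answer
[[ $answer == [Yy]* ]] && wget "${urls[@]}"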
Here are some more ideas. zenity is a way to create dialogs for command line programs. It would be easier to parse if you just copy/pasted the URLs rather than the entire email.
Code:
#!/bin/bash
results=$(zenity --text-info --editable \
    --title="URLs")
case $? in
    1)
        echo "Script canceled."
        exit
        ;;
    -1)
        echo "An unexpected error has occurred."
        exit
        ;;
esac
# if one URL per line
while IFS= read -r url; do
    echo "$url"
done <<< "$results"
zenity should be available, or maybe yad, which works much the same way.
ShellCheck is a useful tool; it's a good idea to run your script through that and fix the issues highlighted before trying to debug further.
I got it going.
I just need to get this to work now:
Code:
val "$data"
wget "$(basename ${link})" "${link}" | unzip -P 12345 \*.zip | s3cmd put -P *.qcow2 s3://image-bucket
rm *
Once the download completes, it *should* then pass the images to unzip and then push them to S3. I thought I could pipe them to the next command, but nope.. it just downloads and then stops.
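For what it's worth, pipes won't work here: wget writes the download to disk rather than to stdout, and unzip and s3cmd both operate on files, so nothing useful travels through the pipe. One way to sequence it instead, per link, is to chain with && so each step only runs if the previous one succeeded (a sketch; the password, .qcow2 pattern and bucket name are carried over from the snippet above):
Code:
# Download under the link's basename, then unzip, upload and clean up.
file=$(basename "$link")
wget -O "$file" "$link" &&
    unzip -o -P 12345 "$file" &&
    s3cmd put -P ./*.qcow2 s3://image-bucket &&
    rm -- "$file" ./*.qcow2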