Old 06-09-2022, 07:52 AM   #1
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Rep: Reputation: 0
Question: BASH: Noob - Script help


Hi All,

New to bash scripting... like, very new. I've really only written the most basic of things: echo "hello world" basic.

I'm looking for some pointers / guidance (happy to research once I know where I'm looking).


Anyway, what I would like to do is write a script that I can copy-paste an email into, and then have it parse the email for download links based on file type.

Then

Confirm those links are correct (yes / no)

Then

wget the files, extract them and push them to location X.

It sounds fairly simple, but in terms of getting it to parse the email for the links I've not got the first idea, and Google is turning into a rabbit hole.
 
Old 06-09-2022, 09:13 AM   #2
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556

What format is your email in?

i.e. are you looking at RFC 5322 Internet Message Format (what mail clients tend to show if you select view source), or the resolved message body only; if the latter, are you dealing with plain text, HTML, or both?

Also, what is your definition of "correct" for this context?


Generally, Greg Wooledge's BashFAQ is a good place for seeing the optimal way to perform specific common tasks, but probably doesn't get you past the first step in this instance, and as you're very new you might want to try BashGuide first.

The Bash Reference Manual is worth keeping bookmarked too.

 
Old 06-10-2022, 02:34 AM   #3
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Original Poster
Rep: Reputation: 0
The email will be just plain text; literally copy and paste the whole email.

I would normally have 10-15 image links in the email, which at the moment I've been extracting by hand and feeding into a script to unpack. I'd rather just dump the email into the script, have it scan for links with particular endings (e.g. .iso) and then come back with a list and ask: are these correct (i.e. has it pulled all the links and excluded anything not needed)?

I then say Y and it starts the wget process.
 
Old 06-10-2022, 02:52 AM   #4
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,349
Blog Entries: 3

Rep: Reputation: 3766
If you're parsing text, then the scripting language you need to turn to is most likely going to be Perl.
 
Old 06-10-2022, 04:35 AM   #5
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Turbocapitalist View Post
If you're parsing text, then the scripting language you need to turn to is most likely going to be Perl.
Maybe in the future, but for now I really need to get solid with Bash scripting.
 
Old 06-10-2022, 04:45 AM   #6
evo2
LQ Guru
 
Registered: Jan 2009
Location: Japan
Distribution: Mostly Debian and CentOS
Posts: 6,724

Rep: Reputation: 1705
Hi,

Not sure how much help you are looking for here, but maybe the following will help (sketch below).

- Loop over the text broken up by whitespace.
- For each piece of text see if it starts with "http://" or "https://"
- If it does, wget it.
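
A rough, untested sketch of those three steps, assuming the email text is already in a shell variable named text:

Code:
# Deliberately unquoted so word splitting breaks $text on whitespace.
for word in $text; do
    # Keep only the pieces that start with http:// or https://
    if [[ $word == http://* || $word == https://* ]]; then
        wget "$word"
    fi
done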

Try to do the above and let us know where you get stuck.

HTH,

Evo2.
 
1 member found this post helpful.
Old 06-10-2022, 08:09 AM   #7
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,152
Blog Entries: 6

Rep: Reputation: 1835
Quote:
Email will be just plain text,
Example:
Code:
text="
Email will be just plain text, Literally copy and paste the whole email.
https://link1.com

I would normally have 10-15 image links included in the email, which at the 
https://link2.com
moment I've been extracting by hand and feeding into a script to unpack..I'd rather just 
dump the email into the script, https://link3.com have it scan it for links including 
lets .iso endings and then come back
with a list and say.. are these correct https://link4.com (i.e have a pulled all links 
and excluded anything not needed).
https://link5.com
https://link6.com

I then say Y and it would start the WGET process. http://link8.com
http://link7.com
http://link9.com
"

grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" <<< "$text"

https://link1.com
https://link2.com
https://link3.com
https://link4.com
https://link5.com
https://link6.com
http://link8.com
http://link7.com
http://link9.com

Quote:
Confirm those links are correct (yes / no)
Example2:
Code:
urls=(
https://link1.com
https://link2.com
https://link3.com
https://link4.com
https://link5.com
https://link6.com
http://link8.com
http://link7.com
http://link9.com
)

agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0"

for u in "${urls[@]}"; do
    if curl -LIA "$agent" --retry 1 --max-time 1 --silent --fail "$u" -o /dev/null; then
        echo "$u is good"
    else
        echo "$u is bad"
    fi
done

https://link1.com is good
https://link2.com is bad
https://link3.com is good
https://link4.com is bad
https://link5.com is good
https://link6.com is bad
http://link8.com is good
http://link7.com is good
http://link9.com is good
I didn't know there was a link1.com.

Also:
Code:
for u in "${urls[@]}"; do
    if wget -U "$agent" --spider --tries=1 --timeout=1 "$u" 2>/dev/null; then
        echo "$u is good"
    else
        echo "$u is bad"
    fi
done

Last edited by teckk; 06-10-2022 at 08:12 AM.
 
Old 06-10-2022, 08:59 AM   #8
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556
Quote:
Originally Posted by evo2 View Post
not sure how much help you are looking for here. But maybe the following will help.

- Loop over the text broken up by whitespace.
- For each piece of text see if it starts with "http://" or "https://"
- If it does, wget it.

Try to do the above and let us know where you get stuck.
Yep, since there's no specific HTML or RFC5322 parsing to do, this is a reasonable approach that can be done using Bash IFS word splitting, looping and conditionals.

Adding a user confirmation might involve storing the URLs in an array in order to print them and prompt for Y/N before wget is called on them.

Of course, since wget accepts multiple URLs, one could instead simply grep for "https?://\S+" inside a command substitution - but that approach is less helpful for learning Bash.
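
For instance, a rough sketch combining the grep with the array idea (untested; it assumes the email text is supplied on the script's stdin):

Code:
# Collect every URL from stdin into an array, one element per match.
mapfile -t urls < <(grep -Eo 'https?://[^[:space:]]+')

# Show what was found, then ask once before fetching the lot.
printf '%s\n' "${urls[@]}"
read -r -p "Download these? (y/n) " answer
[[ $answer == [yY] ]] && wget "${urls[@]}"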


Last edited by boughtonp; 06-10-2022 at 09:00 AM.
 
Old 06-10-2022, 11:31 AM   #9
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,781

Rep: Reputation: 5935
Here are some more ideas: zenity is a way to create dialogs for command-line programs. It would also be easier to parse if you just copy/pasted the URLs rather than the entire email.

Code:
#!/bin/bash
results=$(zenity --text-info --editable \
        --title="URLs")
case $? in
    1)
        echo "Script Canceled"
        exit
        ;;
    -1)
        echo "An unexpected error has occurred."
        exit
        ;;
esac

# assumes one URL per line
while read -r url; do
    echo "$url"
done <<< "$results"
zenity should already be available, or alternatively yad, which works much the same.

Last edited by michaelk; 06-10-2022 at 11:33 AM.
 
Old 06-14-2022, 08:24 AM   #10
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Original Poster
Rep: Reputation: 0
OK, so after some fupping about I've come up with this, but it's failing at the last segment.

Any suggestions on how to get it to complete properly?

Code:
# grab image information from mail
imgs=$(awk -F': ' '
    $1 == "Download link" {link=$2}
    $1 == "Zip password" {pass=$2}
    $1 == "Zip SHA1" {sha1=$2; print link " " pass " " sha1}
    ' <<< "${text}")

# prompt for it
awk '{print $1}' <<<"${imgs}" 
read -p "Download these? (y/n)?" prompt
case "${prompt}" in
    y|Y ) echo "yes";;
    n|N ) exit 0;;
    * ) exit 1;;
esac

#mkdir -p /dir/dir1
#pushd /dir/dir1

while read link pass sha1
do
    pwd
    echo curl -sSL --fail --retry 3 -o "$(basename ${link})" "${link}"

    # check checksum
    echo "${sha1} $(basename ${link})" #| sha1sum -c -

    # download, unzip and export
    
    wget "${imgs}"

    unzip -P 12345 \*.zip
    s3cmd put -P *.qcow2 s3://image-store/
    rm *

done

Last edited by Skiddo; 06-14-2022 at 08:28 AM.
 
Old 06-14-2022, 11:17 AM   #11
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,627

Rep: Reputation: 2556

Failing how?

ShellCheck is a useful tool; it's a good idea to run your script through it and fix the issues it highlights before trying to debug further.
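
For example (assuming your script is saved as scripty.sh):

Code:
shellcheck scripty.sh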

 
Old 06-15-2022, 06:34 AM   #12
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by boughtonp View Post
Failing how?

ShellCheck is a useful tool; it's a good idea to run your script through it and fix the issues it highlights before trying to debug further.

I got it going.

I just need to get this to work now:

Code:
val "$data"
    wget "$(basename ${link})" "${link}" | unzip -P 12345 \*.zip | s3cmd put -P *.qcow2 s3://image-bucket
rm *
Once the download completes, it *should* then pass the images to unzip and then push them to s3. I thought I could pipe them to the next command, but nope... it just downloads and then stops.

Last edited by Skiddo; 06-15-2022 at 06:55 AM.
 
Old 06-15-2022, 06:58 AM   #13
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,033

Rep: Reputation: 7344
Code:
wget "$(basename ${link})" "${link}" && unzip -P 12345 \*.zip && s3cmd put -P *.qcow2 s3://image-bucket
probably
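
(A pipe feeds wget's stdout into the next command's stdin, which unzip ignores when unpacking a file on disk; && instead runs each command only if the previous one succeeded.)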
 
Old 06-15-2022, 09:06 AM   #14
Skiddo
LQ Newbie
 
Registered: Jun 2022
Distribution: Ubuntu / Deepin
Posts: 6

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by pan64 View Post
Code:
wget "$(basename ${link})" "${link}" && unzip -P 12345 \*.zip && s3cmd put -P *.qcow2 s3://image-bucket
probably
Nope.

This is the output

Code:
root@host:~# bash scripty.sh 
https://random.image.download.qcow2.zip
Download these? (y/n)?y
yes
~/export ~
--2022-06-15 13:50:36--  https://random.image.download.qcow2.zip
Resolving random.image.download.qcow2.zip (random.image.download.qcow2.zip)... failed: Name or service not known.
wget: unable to resolve host address ‘random.image.download.qcow2.zip’
--2022-06-15 13:50:36--  https://random.image.download.qcow2.zip
Resolving random.image.download.com (random.image.download.com)... 174.111.123.123
Connecting to random.image.download.com (random.image.download.com)|174.111.123.123|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8026236680 (7.5G) 
Saving to: ‘random-image-download.qcow2.zip’

random.image.download.qcow2.zip 100%[===========================================>]   7.47G  17.7MB/s    in 7m 20s  

2022-06-15 13:57:56 (17.4 MB/s) - ‘random-image-download.qcow2.zip’ saved [8026236680/8026236680]

FINISHED --2022-06-15 13:57:56--
Total wall clock time: 7m 20s
Downloaded: 1 files, 7.5G in 7m 20s (17.4 MB/s)

Last edited by Skiddo; 06-15-2022 at 09:08 AM.
 
Old 06-16-2022, 01:13 AM   #15
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,369

Rep: Reputation: 2753
Personally I'd put 'set -xv' just before that line and also simplify it by putting each cmd on a separate line.
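
Something like this sketch (assuming $link holds the URL to fetch):

Code:
set -xv                 # print each command and its expansion as it runs
wget "$link"
unzip -P 12345 \*.zip
s3cmd put -P *.qcow2 s3://image-bucket
set +xv                 # tracing back off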
 
  

