Now, when my script searches the definitions file, it will use the value I have in the log, which is:
"/index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"
and I want the script to recognize that this line corresponds to "/module/action/param1 ${@die(md5(HelloThinkPHP))}", and then I will retrieve with awk the field $1, which is ThinkPHP_RCE.
Can you go into a little more detail and give one or two more examples? It sounds like you want to read patterns in from one file and search for them in a second. If that is the case, you might have to escalate to perl to avoid lots of loops in AWK.
bash-5.0$ awk 'FILENAME=="def.conf" {a[i]=$1;b[i]=$2;i++}; FILENAME!="def.conf" {for(i in b) {if(match($2,b[i])>0) {print a[i]; break} else {if(i==length(b)-1) {print "No match"}}}}' def.conf def.log
cc
aa
bb
No match
aa
bb
This reads the def.conf file into two arrays, then processes the def.log file. Calling length() on an array is a gawk extension.
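A minimal sketch of the same two-pass idea, assuming def.conf holds a label in column 1 and a pattern in column 2, and that the log field to test is column 2 as in the one-liner above. The sample file contents and the function name `label_requests` are invented for illustration, since the real files are not shown in the thread. It uses a counter instead of length(), so it should not depend on gawk.

```shell
#!/bin/bash
# Hypothetical sample data; the real def.conf and def.log are not shown in the thread.
workdir=$(mktemp -d)
printf '%s\n' 'aa /foo' 'bb /bar' > "$workdir/def.conf"
printf '%s\n' 'GET /bar/x' 'GET /foo/y' 'GET /zzz' > "$workdir/def.log"

label_requests() {
    # $1 = definitions file, $2 = log file
    awk '
        # First file only (FNR equals NR there): store label and pattern in order.
        FNR == NR { n++; label[n] = $1; pat[n] = $2; next }
        # Log lines: print the label of the first definition matching field 2.
        {
            found = 0
            for (k = 1; k <= n; k++) {
                if ($2 ~ pat[k]) { print label[k]; found = 1; break }
            }
            if (!found) print "No match"
        }
    ' "$1" "$2"
}

label_requests "$workdir/def.conf" "$workdir/def.log"
```

Walking the definitions with a numeric counter keeps them in file order, which a `for (i in b)` loop does not guarantee.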
Thanks both of you. Allend is almost there; the problem is that I cannot rely on the last 2 characters found, because that is not enough and a lot of false positives will appear.
One of the difficulties here is that I have more text in the variable than in the file that will provide the output I want.
and I send the script to search the definitions file above with this variable:
blalbalb/rttrh/456430/ewrewr/88000
then I am stuck because nothing will be found.
Another alternative would be the inverse, which means taking the definitions file line by line and searching the log. This will work because the variable will be small:
If I search for:
rttrh/456430/ewrewr/88000
in
blalbalb/rttrh/456430/ewrewr/88000
then I will get a positive result, but doing it line by line will waste a lot of resources and time.
Now, one thing that will do the job is removing the text up to the first slash and then searching; if nothing is found, remove the text up to the next forward slash and search again.
This will work, but eventually I will do a lot of searches with no result, which will increase the script's run time.
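The strip-and-retry idea described above can be sketched in bash roughly as follows. This is only an illustration under assumptions: the definitions file is taken to hold one literal path fragment per line, and the function name `find_definition` and the sample data are hypothetical.

```shell
#!/bin/bash
# Hypothetical definitions file: one literal path fragment per line.
defs=$(mktemp)
printf '%s\n' 'rttrh/456430/ewrewr/88000' 'HNAP1' > "$defs"

find_definition() {
    # $1 = request string taken from the log, $2 = definitions file
    local candidate=$1
    while [ -n "$candidate" ]; do
        # -F: literal string, -x: whole line must match, -q: quiet, stop at first hit
        if grep -Fxq -- "$candidate" "$2"; then
            printf '%s\n' "$candidate"
            return 0
        fi
        case $candidate in
            */*) candidate=${candidate#*/} ;;   # drop everything up to the first slash
            *)   return 1 ;;                    # no slash left, nothing more to strip
        esac
    done
    return 1
}

find_definition 'blalbalb/rttrh/456430/ewrewr/88000' "$defs"
```

Each retry starts a new grep process, so the cost grows with the number of path components, which matches the concern about many searches with no result.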
Quote:
Thanks both of you. Allend is almost there; the problem is that I cannot rely on the last 2 characters found, because that is not enough and a lot of false positives will appear.
What is missing for us is: Precisely how much of the matched string can you rely on?
Can you provide a clear example of a single match pattern from the def file, along with a few lines which should match and a few which should not match? I have tried to see that from your examples already given but without success.
allend, look, I didn't move away from my original post, I just gave another example.
Quote:
Now, when my script searches the definitions file, it will use the value I have in the log, which is:
"/index.php/module/action/param1 ${@die(md5(HelloThinkPHP))}"
Quote:
Now, in the web server log this line could take many forms, but that specific sequence is there. By this I mean:
and I send the script to search the definitions file above with this variable:
blalbalb/rttrh/456430/ewrewr/88000
then I am stuck because nothing will be found.
Definitions is a file where I will store all the values to compare against.
The IP address in the first post was just an example; of course I will not send the IP address to grep, I will send only what I need to search for.
I can leave out "/index.php" because that file name changes, so I will rely only on
Quote:
/module/action/param1 ${@die(md5(HelloThinkPHP))}
Now, what I need is not how to turn the first line into the second line by removing everything up to the second forward slash.
What I need is the fastest way to look in a big file for that combination.
I usually use grep, but for heavy files it might be worth using something faster.
However, I have an issue here: every line is different, so this cannot be handled as one single case.
I have lines in the log like this:
/HNAP1
which is an information disclosure attack on D-Link routers (I believe); in this case I cannot remove the text up to the first forward slash.
Thinking about it a little more, what I really need is to see whether the beginning of the variable is a file or a directory.
That still does not specify how much you can rely on very precisely.
Question: Can you rely on there always being a string matching /module/action/param1 in every line you want to search for?
Quote:
Originally Posted by pedropt
However, I have an issue here: every line is different, so this cannot be handled as one single case.
I have lines in the log like this:
/HNAP1
But you have not said what you want to do in these cases. Ignore the line? Search for the line? What?
UPDATE: Think in terms of your thread title:
Quote:
Search text if some part sequence exists
Define for us exactly the part sequence which exists and is used to trigger the search.
Last edited by astrogeek; 10-09-2019 at 04:45 PM.
Reason: Updated
Now, the definitions file contains a sequence that may or may not be equal to what I have in the log. This is because hackers usually run automated scripts that work through a list of potential directories.
For me this means that I don't need to write every line in the definitions file; I just need to write one line that I know they will use for sure, and this way I can identify the technique used.
In the quote above there are multiple exploits they have tried, but before it starts digging through the definitions file for what they were after, my script must first identify what kind of request was made to the server.
From the quote above, this is what the script must search for:
Line 1 = /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/
Line 2 = /HNAP1/
Line 3 = /prov/aastra.cfg
Line 4 = /f4bb336d/
Line 5 = Ignore
Line 6 = /module/action/param1/${@die(md5(HelloThinkPHP))}
Line 7 = /App/?content=die(md5(HelloThinkPHP))
Line 8 = /editBlackAndWhiteList
Line 9 = /0015650000000.cfg
How it should work in code:
If the last part of the variable is a file and it is a .php, then remove that file name and search.
If it does not have any file at the beginning or the end, then search (Line 2).
If a file name follows a directory but it is not PHP, then search without removing anything.
If it starts with a file name other than PHP, then search everything.
Summarizing:
- a) Detect whether "anything.php" exists at the beginning or at the end of the variable and remove it.
- b) If case a) is true, then execute it and search.
- If case a) is false, then search.
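Rule a) above could be sketched in bash like this. It is only an illustration under assumptions: the function name `normalize_request` is hypothetical, the input is assumed to be the request path alone, and query strings after a .php name are not handled.

```shell
#!/bin/bash
normalize_request() {
    # $1 = request path as it appears in the log
    local req=$1
    # Regexes kept in variables so bash treats them as patterns, not quoted text.
    local lead='^/[^/]+\.php(/.*)?$'   # "/name.php" at the very beginning
    local tail='^(.*/)[^/]+\.php$'     # "name.php" after the last slash
    if [[ $req =~ $lead ]]; then
        req=${BASH_REMATCH[1]}         # keep what follows the leading file name
    fi
    if [[ $req =~ $tail ]]; then
        req=${BASH_REMATCH[1]}         # keep the directory part only
    fi
    printf '%s\n' "$req"
}

normalize_request '/index.php/module/action/param1/${@die(md5(HelloThinkPHP))}'
normalize_request '/f4bb336d/admin.php'
normalize_request '/prov/aastra.cfg'
```

A request that consists of nothing but a .php file name comes back empty, which a caller could treat as "nothing left to search for".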
Now, what matters most in the code is a fast search.
Looks a lot like you are reinventing modsecurity...
Your line 5 case seems at odds with your rule "anything.php = remove and search" as stated. How would the script know to ignore it?
What do you want to get as the final output? The lines from the log or simply a count of the matching lines?
How do you intend to use this? Near real time as lines are added to the logs? Once per day/week to extract stats? For reporting purposes or blocking purposes?
There is a lot of relevant info we do not have.
At the very least I think that you have the problem, as stated so far, backwards - instead of searching the logs, mangling the lines then searching the definitions for a match with the mangle, simply search the logs for matches to the second part of the definitions one definition at a time, replace matches, skip others.
That said, I don't think your problem is yet well enough defined as indicated by the line 5 mismatch, and I would suggest looking at a rule set for modsecurity to see what is actually involved in matching common exploits by regular expression.
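To illustrate the "search the logs for the definitions" direction, here is a rough sketch that hands every definition to a single grep as a fixed string. The file layout (a label, one space, then the literal text), the sample data and the function name `match_log_lines` are assumptions for the example, and reading the pattern list from standard input with `-f -` is assumed to behave as it does in GNU grep.

```shell
#!/bin/bash
workdir=$(mktemp -d)
# Hypothetical definitions: label, one space, then a literal string.
printf '%s\n' 'HNAP_probe /HNAP1' 'ThinkPHP_RCE /module/action/param1' > "$workdir/defs.txt"
# Hypothetical log lines.
printf '%s\n' \
    'GET /index.php/module/action/param1/x' \
    'GET /favicon.ico' \
    'GET /HNAP1/' > "$workdir/access.log"

match_log_lines() {
    # $1 = definitions file, $2 = log file
    # Drop the label column, then pass all remaining strings to one grep.
    # -F: fixed strings (no regex escaping needed), -f -: read them from stdin.
    cut -d' ' -f2- "$1" | grep -F -f - -- "$2"
}

match_log_lines "$workdir/defs.txt" "$workdir/access.log"
```

This reads the log once regardless of how many definitions there are, but it only prints the matching log lines; it does not say which definition matched.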
and server.conf (escaping the characters used in creating regular expressions)
Quote:
aa /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/
bb /HNAP1/
cc /prov/aastra.cfg
dd /f4bb336d/
ee /module/action/param1/\${@die\(md5\(HelloThinkPHP\)\)}
ff /App/\?content=die\(md5\(HelloThinkPHP\)\)
gg /editBlackAndWhiteList
hh /0015650000000.cfg
and server.awk
Code:
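# Definitions file: store each label in a and its pattern in b
FILENAME=="server.conf" {a[i]=$1; b[i]=$2; i++}
# Log lines: report a definition whose pattern matches field 6, then stop
FILENAME!="server.conf" {
    for(i in b) {
        if(match($6,b[i])>0) {
            print a[i] " Found " b[i] " in " $0
            break
        }
    }
}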
then
Code:
bash-5.0$ awk -f server.awk server.conf server.log
aa Found /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
bb Found /HNAP1/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /HNAP1/
cc Found /prov/aastra.cfg in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /prov/aastra.cfg
dd Found /f4bb336d/ in xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /f4bb336d/admin.php
ee Found /module/action/param1/\${@die\(md5\(HelloThinkPHP\)\)} in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
ff Found /App/\?content=die\(md5\(HelloThinkPHP\)\) in xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
gg Found /editBlackAndWhiteList in xxx.xxx.xxx.xxx - [09/Oct +0100] "POST /editBlackAndWhiteList
hh Found /0015650000000.cfg in xxx.xxx.xxx.xxx - [08/Oct +0100] "GET /0015650000000.cfg
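As a possible alternative to escaping the definitions, awk's index() tests for a literal substring, so the strings could be stored unescaped. A minimal sketch, assuming the same two-column layout and the same log field position ($6) as above; the file names `plain.conf` and `sample.log`, the function name `literal_match` and the single log line are made up for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)
# Hypothetical unescaped definitions in the same two-column layout.
cat > "$workdir/plain.conf" <<'EOF'
ee /module/action/param1/${@die(md5(HelloThinkPHP))}
ff /App/?content=die(md5(HelloThinkPHP))
EOF
# One invented log line shaped like the output shown above.
cat > "$workdir/sample.log" <<'EOF'
xxx.xxx.xxx.xxx - [09/Oct +0100] "GET /App/?content=die(md5(HelloThinkPHP))
EOF

literal_match() {
    # $1 = definitions file, $2 = log file
    awk '
        FNR == NR { n++; label[n] = $1; text[n] = $2; next }
        {
            for (k = 1; k <= n; k++) {
                # index() returns 0 when text[k] does not occur in field 6
                if (index($6, text[k]) > 0) {
                    print label[k] " Found " text[k] " in " $0
                    break
                }
            }
        }
    ' "$1" "$2"
}

literal_match "$workdir/plain.conf" "$workdir/sample.log"
```

The trade-off is that definitions can then no longer use regular expression features such as anchors or alternation.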
This will be done IP by IP: I will first choose the IP, and then your code will identify what that IP was doing on the server. After your code I will add some other code that, in case nothing was found in the definitions file (server.conf), will ask me to add a new line to the definitions file for future detection.
Great code indeed, I was not expecting it to be so simple.
I have not yet marked this thread as solved because I am having difficulty moving the code inside the script without needing an additional instruction file "server.awk".
Code:
FILENAME=="server.conf" {a[i]=$1; b[i]=$2; i++}
FILENAME!="server.conf" {
    for(i in b) {
        if(match($6,b[i])>0) {
            print a[i] " Found " b[i] " in " $0
            break
        }
    }
}
How should it look when exported normally into a script?
Example:
var1 = some strings to be matched in server.conf
If the string matches then stop, else continue checking other lines.
What I am doing here is:
After I select an IP to be checked in the web server's log, I will first grab all the log data from that IP into a temp file, and then I will need this code to compare the IP's requests in the temp.tmp file with the definitions file "server.conf". The loop will read the first line of the temp file and search the definitions file "server.conf" for a match; if something is found, it stops there and returns the result.
Something like this:
After exporting the IP data to a temp file
Code:
ipval="some code before, where I will retrieve the IP to be checked"
# This counts the lines to be checked from that IP in the temp file
cntip=$(wc -l temp.tmp | awk '{print $1}')
for i in $(seq $cntip)
do
    var1=$(sed -n "${i}p" temp.tmp)
    # Placeholder: awk instruction without the loop in the previous code, retrieving $1
    # from server.conf in case var1 matches any $2 in server.conf
    var2=""
    if [[ -n "$var2" ]]
    then
        echo "$ipval activity in server was $var2"
        # stop the loop
        break
    fi
done
Note: the temp.tmp file will be a cleaned file with only the requests that IP made; no other data will be there.
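One way the awk program might be embedded directly in the bash script, so that no separate server.awk file is needed, is to pass it as a single-quoted string. This is a sketch under assumptions: temp.tmp is taken to hold one request path per line, server.conf keeps its label and pattern columns, and the function name `first_activity`, the sample data and the placeholder IP are hypothetical.

```shell
#!/bin/bash
first_activity() {
    # $1 = definitions file (label, regex), $2 = file with one request per line
    awk '
        FNR == NR { n++; label[n] = $1; re[n] = $2; next }
        {
            for (k = 1; k <= n; k++) {
                # Print the label and stop reading at the first request that matches.
                if ($0 ~ re[k]) { print label[k]; exit }
            }
        }
    ' "$1" "$2"
}

# Hypothetical data for demonstration only.
workdir=$(mktemp -d)
printf '%s\n' 'bb /HNAP1/' 'cc /prov/aastra\.cfg' > "$workdir/server.conf"
printf '%s\n' '/robots.txt' '/prov/aastra.cfg' '/HNAP1/' > "$workdir/temp.tmp"

ipval="203.0.113.7"   # placeholder address, not taken from any real log
var2=$(first_activity "$workdir/server.conf" "$workdir/temp.tmp")
if [[ -n $var2 ]]; then
    echo "$ipval activity in server was $var2"
fi
```

Because awk reads temp.tmp itself, the per-line `sed -n` calls and the line count are not needed in this variant.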
An example of temp.tmp file would be this :
Code:
#!/bin/bash
# One array element per whitespace-separated word in server.conf
Patterns=( $( cat server.conf ) )
# or
#while read -r P
#do
#    Patterns+=($P)
#done < server.conf
while read -r logline
do
    for i in "${Patterns[@]}"
    do
        [[ ${logline} =~ $i ]] \
            && printf "Pattern %s matches \"%s\"\n" "${i}" "${logline}" \
            && break
    done
done < server.log
The entries in server.conf need to be actual patterns; at the moment they are not.
How does the IP relate?
Code:
#
# create Patterns array as above
#
while read -a ip
do
    # TODO add some condition here
    while read logline
    do
        # TODO add some condition here
        for i in "${Patterns[@]}"
        do
            [[ ${logline} =~ $i ]] \
                && printf "Pattern %s matches \"%s\"\n" "${i}" "${logline}" \
                && break
            # TODO add real actions here
        done
    done < <(grep "${ip[10]%:*}" someotherlog.log)
done < somelog.log
Code:
FILENAME=="server.conf" {a[i]=$1; b[i]=$2; i++}
# Only log lines whose first field equals the IP passed in with -v IP=...
FILENAME!="server.conf" && $1==IP {
    for(i in b) {
        if(match($6,b[i])>0) {
            print a[i] " Found " b[i] " in " $0
            break
        }
    }
}
then
Code:
bash-5.0$ awk -f server.awk -v IP="111.111.111.111" server.conf server.log
aa Found /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/ in 111.111.111.111 - [09/Oct +0100] "GET /wp-content/plugins/portable-phpmyadmin/wp-pma-mod/index.php
cc Found /prov/aastra.cfg in 111.111.111.111 - [09/Oct +0100] "GET /prov/aastra.cfg
ee Found /module/action/param1/\${@die\(md5\(HelloThinkPHP\)\)} in 111.111.111.111 - [09/Oct +0100] "GET /index.php/module/action/param1/${@die(md5(HelloThinkPHP))}
gg Found /editBlackAndWhiteList in 111.111.111.111 - [09/Oct +0100] "POST /editBlackAndWhiteList