PowerShell regex to match text in a specific order and include only those matches in the results


I'm trying to find a working regex for PowerShell's Select-String cmdlet. It should look for a specific text that marks the start of a block, then match other specific texts in order until the last one is found.

Example of the file text:

[begin of_header]
some.text="text"
some.text="text"
serial=0x94pa
some.text="text"
some.text="text"
timer=0
some.text="text"
some.text="text"
tag.sm=00
some.text="text"
some.text="text"
some.text="text"
some.text="text"
tag.om=00
some.text="text"
some.text="text"
some.text="text"
tag.uc=00
some.text="text"
some.text="text"
some.text="text"
events=pd_exf1
some.text="text"
some.text="text"
some.text="text"
acp="my looking dynamic text"
some.text="text"
some.text="text"
dir=6
some.text="text"
some.text="text"
wg=100
some.text="text"
some.text="text"
h=95.5
some.text="text"
some.text="text"

[begin of_header]
serial=0xzzz
timer=0
some.text="text"
some.text="text"
tag.om=00
tag.uc=00
some.text="text"
some.text="text"
events=pd_exf1
acp="my looking dynamic text"
dir=6
wg=100
h=95.5

[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
some.text="text"
some.text="text"
tag.om=00
tag.uc=00
some.text="text"
some.text="text"
events=pd_exf1
acp="my looking dynamic text"
dir=6
wg=100
h=95.5

In this case the static text [begin of_header] should mark the starting point. From there the dynamic values must match in exact order, beginning with serial= and ending with acp="my looking dynamic text". Both acp= and serial= can have various values. If a value is missing, for example tag.sm=00, the search should skip that group, jump to the next [begin of_header], and start analyzing again.

The result should look like this:

[begin of_header]
serial=0x94pa
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"

[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"

I found something similar here, but it doesn't work the way I want.

The following also doesn't work as expected, because it does not exclude groups in which the exact match order is broken:

Select-String -LiteralPath "c:\myfile.txt" -Pattern "\[begin of_header\]|serial=|timer=|tag.sm=|tag.om=|tag.uc=|events=|acp=" | Select-Object LineNumber,Line
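
To make the problem concrete, here is a minimal sketch on a made-up, shortened sample (the `$lines` array and its values are assumptions for illustration, not the real file). Select-String tests every input line on its own, so an alternation pattern can only say "this line contains one of the keys"; it has no way to notice that a key is missing from a group.

```powershell
# Hypothetical sample: the first group is complete, the second lacks tag.sm
$lines = @(
    '[begin of_header]', 'serial=0x1', 'tag.sm=00', 'acp="a"',
    '[begin of_header]', 'serial=0x2', 'acp="b"'
)
$pattern = '\[begin of_header\]|serial=|tag\.sm=|acp='

# Each string is matched independently of its neighbours
$hits = $lines | Select-String -Pattern $pattern

# Every line matches, including those of the incomplete second group
$hits.Count
$hits | Select-Object LineNumber, Line
```

All seven lines come back, so the incomplete group is not filtered out. Enforcing the order requires matching against a whole group at once rather than line by line.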

The regular expression is complex, but since the order of the elements is fixed I don't see a problem.


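$header = '[begin of_header]'
# One capture group per required key, in the required order
$re = [regex]'(?smi)(^serial=.*?$).*(^timer=.+?$).*(^tag\.sm=.+?$).*(^tag\.om=.+?$).*(^tag\.uc=.+?$).*(^events=.+?$).*(^acp=.+?$)'

# Read the file as one string, split it into one chunk per header,
# and keep only the chunks that match the whole sequence
(Get-Content .\myfile.txt -Raw) -split [regex]::Escape($header) |
    Select-String $re | ForEach-Object {
        $header
        for ($i = 1; $i -lt 8; $i++) { $_.Matches.Groups[$i].Value }
        ""
    }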

Sample output:

> q:\test\2017\09\10\so_46139332.ps1
[begin of_header]
serial=0x94pa
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"

[begin of_header]
serial=0xpppp
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
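
For trying this without a file on disk, the following is a self-contained sketch of the same split-then-match idea. The here-string is a shortened, made-up sample, and it calls the regex object's Match method directly instead of Select-String; both are substitutions for illustration, not part of the script above. Carriage returns are stripped first, on the assumption that the input may use Windows line endings, because in .NET `$` only anchors before `\n` and a stray `\r` would otherwise end up inside the captured values.

```powershell
# Hypothetical, shortened sample: the second group lacks tag.sm and must be skipped
$text = @'
[begin of_header]
some.text="text"
serial=0x94pa
timer=0
tag.sm=00
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
dir=6
[begin of_header]
serial=0xzzz
timer=0
tag.om=00
tag.uc=00
events=pd_exf1
acp="my looking dynamic text"
'@ -replace '\r', ''   # normalise line endings so captures carry no stray CR

$header = '[begin of_header]'
$re = [regex]'(?smi)(^serial=.*?$).*(^timer=.+?$).*(^tag\.sm=.+?$).*(^tag\.om=.+?$).*(^tag\.uc=.+?$).*(^events=.+?$).*(^acp=.+?$)'

# Split into one chunk per header and test each chunk as a whole
$result = foreach ($chunk in ($text -split [regex]::Escape($header))) {
    $m = $re.Match($chunk)
    if ($m.Success) {
        $header                                         # re-emit the header consumed by -split
        1..7 | ForEach-Object { $m.Groups[$_].Value }   # the seven captured lines, in order
    }
}
$result
```

Only the first group is printed; the second one never matches because no tag.sm= line exists between timer= and tag.om=.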

  • The header is used to split the file contents into chunks, and the RE is matched against each chunk separately.
  • (?smi) advises the RE to use:

    • s modifier: single line. The dot also matches newline characters.

    • m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only the begin/end of the string).

    • i modifier: insensitive. Case-insensitive match.
  • (^serial=.*?$).*

    • 1st capturing group (^serial=.*?$)
      ^ asserts the position at the start of a line
      serial= matches the characters serial= literally (case insensitive)
      .*? matches any character between zero and unlimited times, as few times as possible, expanding as needed (lazy)
      $ asserts the position at the end of a line
    • .* after the group matches any character between zero and unlimited times, as many times as possible, giving back as needed (greedy)
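
The effect of the modifiers and of lazy versus greedy quantifiers can be checked on a tiny string. This is only an illustrative sketch; the three-line `$s` value and the variable names are made up for the demonstration.

```powershell
# Hypothetical three-line input, joined with LF only
$s = "serial=0x1`nother=1`nserial=0x2"

# m modifier: without it ^ matches only at the start of the string
$countPlain = ([regex]'^serial=\S+').Matches($s).Count       # 1
$countMulti = ([regex]'(?m)^serial=\S+').Matches($s).Count   # 2

# s modifier: without it the dot stops at the first newline
$dotPlain  = ([regex]'serial=.*').Match($s).Value       # serial=0x1
$dotSingle = ([regex]'(?s)serial=.*').Match($s).Value   # the whole string

# Lazy vs greedy with both modifiers active
$lazy   = ([regex]'(?sm)^serial=.*?$').Match($s).Value   # serial=0x1
$greedy = ([regex]'(?sm)^serial=.*$').Match($s).Value    # the whole string

$countPlain, $countMulti, $dotPlain, $lazy
```

This is why each capture group in the pattern uses a lazy quantifier followed by $: under the s modifier a greedy one would run past the end of its own line.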
