bash - Converting a regex to a sed or grep regex


I'm not sure why this doesn't work. Here is my regex: 'text\' => '.*?' . I want to catch estrenos and cine in the following nasty text using grep or sed. Here is what I tried in grep:

echo "sadsa d{                             'text' => 'cine',                             'indices' => [                                            111,                                            116                                          ]                           },                           {                             'text' => 'estrenos',                             'indices' => [ ssadw" | grep -Eo "'text\' => '.*?',"

Just use awk:

$ awk -v RS='}' -F\' '{print $4}' file
cine
estrenos

That will work with any POSIX awk in any shell on a UNIX box. It will work whether the input is on one line or spread across multiple lines, and no matter how many blanks or tabs occur anywhere on each line.
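As a quick check of that claim, here is a small sketch that feeds the same data to the command spread over several lines with uneven spacing. The multi-line sample and the variable names `input` and `result` are made up for illustration; only the awk command itself comes from the answer.

```shell
# Hypothetical multi-line version of the input, with uneven blanks.
input="sadsa d{
      'text'   =>   'cine',
  'indices' => [ 111, 116 ]
},
{
    'text' => 'estrenos',
         'indices' => [ ssadw"

# Same awk command as above, reading from stdin instead of a file.
result=$(printf '%s\n' "$input" | awk -v RS='}' -F\' '{print $4}')
printf '%s\n' "$result"
```

The newlines end up inside the quote-delimited fields instead of splitting them, which is why the 4th field is unaffected.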

Here's how it works:

awk treats its input as records, each separated into fields. The input (with spaces compressed for readability):

sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw 

clearly has two { ... } records:

Record 1:

{ 'text' => 'cine', 'indices' => [ 111, 116 ] } 

Record 2:

{ 'text' => 'estrenos', 'indices' => [ ssadw 

So we can set the record separator to } (with -v RS='}'). You might assume the last record has to end in }, but if it doesn't that's fine, since awk treats end of file as end of record. We can ignore the text before the {s (i.e. "sadsa d" before the first record and "," between the 2 records): that's treated as part of the first field, and since we're not using that field it's irrelevant.
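To look at the records on their own, before any field splitting, a sketch like the following prints each one between angle brackets and counts them. The variable names `input` and `count` are illustrative, and the input is the compressed one-line sample from above.

```shell
input="sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw"

# Show each record as awk sees it when } is the record separator.
printf '%s\n' "$input" | awk -v RS='}' '{print "record " NR ": <" $0 ">"}'

# Count the records: the unterminated last one still counts,
# because end of input closes it.
count=$(printf '%s\n' "$input" | awk -v RS='}' 'END {print NR}')
printf '%s\n' "$count"
```

Note that the trailing newline of the input stays inside the last record, since newline is no longer the record separator.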

So given the above 2 records, if we split them into fields at every ' (with -F\') we get:

$ awk -v RS='}' -F\' '{for (i=1; i<=NF; i++) print "record nr", NR, "field nr", i, "field contents: <" $i ">"; print "----" }' file
record nr 1 field nr 1 field contents: <sadsa d{ >
record nr 1 field nr 2 field contents: <text>
record nr 1 field nr 3 field contents: < => >
record nr 1 field nr 4 field contents: <cine>
record nr 1 field nr 5 field contents: <, >
record nr 1 field nr 6 field contents: <indices>
record nr 1 field nr 7 field contents: < => [ 111, 116 ] >
----
record nr 2 field nr 1 field contents: <, { >
record nr 2 field nr 2 field contents: <text>
record nr 2 field nr 3 field contents: < => >
record nr 2 field nr 4 field contents: <estrenos>
record nr 2 field nr 5 field contents: <, >
record nr 2 field nr 6 field contents: <indices>
record nr 2 field nr 7 field contents: < => [ ssadw >
----

So you can see that the value you want is the 4th field of each record.
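Since the question asked about grep specifically: the original pattern most likely fails for two reasons. `.*?` is a Perl-style lazy quantifier that POSIX extended regular expressions do not define, and in GNU grep `\'` is an end-of-buffer anchor rather than a literal quote. A possible grep-based sketch, assuming a grep that supports -o (GNU and BSD grep do) and exactly one space on each side of =>, replaces the lazy match with a negated bracket expression and lets cut pick out the value. The variable names `input` and `result` are illustrative.

```shell
input="sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw"

# [^']* stops at the next quote, so no lazy quantifier is needed.
# cut then keeps the 4th quote-delimited field, i.e. the value itself.
result=$(printf '%s\n' "$input" | grep -Eo "'text' => '[^']*'" | cut -d"'" -f4)
printf '%s\n' "$result"
```

Unlike the awk version, this depends on the spacing around =>, so it is less tolerant of reformatted input.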

