I'm not sure why this doesn't work. Here is my regex: 'text\' => '.*?'
I want to catch estrenos and cine in the following nasty text using grep or sed. Here is what I tried with grep:
echo "sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw" | grep -Eo "'text\' => '.*?',"
Just use awk:
$ awk -v RS='}' -F\' '{print $4}' file
cine
estrenos
That will work with a standard awk in a shell on a Unix box. White space doesn't matter: it'll work whether the input is on one line or spread across multiple lines, and no matter how many blanks or tabs occur anywhere on each line.
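As a quick check of that claim, here is a minimal sketch that runs the same awk command on the one-line sample and on a copy of it broken across several lines with tabs and extra blanks. The function name extract_text and the multi-line variant of the input are made up for illustration; only the awk command itself comes from the answer.

```shell
#!/bin/sh
# Hypothetical helper: wraps the awk command from the answer, reading stdin.
extract_text() {
    awk -v RS='}' -F\' '{print $4}'
}

# The sample from the question, all on one line.
one_line=$(printf '%s\n' "sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw" | extract_text)

# The same data spread over several lines, with tabs and extra blanks
# (an invented variant, only to exercise the white-space claim).
multi_line=$(printf "sadsa d{\n\t'text'   =>\n 'cine',\n 'indices' => [ 111, 116 ] },\n{ 'text' =>\t'estrenos', 'indices' => [ ssadw\n" | extract_text)

printf '%s\n' "$one_line"
printf '%s\n' "$multi_line"
```

Both calls should print the same two values, because with a single-character field separator a newline is just another character inside a record.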
Here's how it works:
awk treats its input as records, each separated into fields. Your input (with spaces compressed for readability):
sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw
clearly has { ... } records:
Record 1:
{ 'text' => 'cine', 'indices' => [ 111, 116 ] }
Record 2:
{ 'text' => 'estrenos', 'indices' => [ ssadw
So we can set the record separator to } (with -v RS='}'). I assume the last record should end in }, but if it doesn't that's fine, since awk treats end of file as end of record. We can ignore the text before the {s (i.e. the "sadsa d" before the first record and the "," between the two records): that text is treated as part of the first field, and since we're not using that field it's irrelevant.
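To see the record split on its own, before any field splitting, a small sketch like the following prints each record awk produces with } as the separator. The variable names input and records are arbitrary, and the sub() call is only there to drop the final newline from the last record for display.

```shell
#!/bin/sh
input="sadsa d{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ ssadw"

# Show each record awk sees when "}" is the record separator.
# The trailing newline of the input belongs to the last record,
# so strip it before printing.
records=$(printf '%s\n' "$input" |
    awk -v RS='}' '{ sub(/\n$/, ""); printf "record %d: <%s>\n", NR, $0 }')

printf '%s\n' "$records"
```

This should show two records, with the leading "sadsa d" attached to the first and the "," attached to the second.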
So given the above two records, if we split them into fields at every ' (with -F\') we get:
$ awk -v RS='}' -F\' '{for (i=1; i<=NF; i++) print "record nr", NR, "field nr", i, "field contents: <" $i ">"; print "----" }' file
record nr 1 field nr 1 field contents: <sadsa d{ >
record nr 1 field nr 2 field contents: <text>
record nr 1 field nr 3 field contents: < => >
record nr 1 field nr 4 field contents: <cine>
record nr 1 field nr 5 field contents: <, >
record nr 1 field nr 6 field contents: <indices>
record nr 1 field nr 7 field contents: < => [ 111, 116 ] >
----
record nr 2 field nr 1 field contents: <, { >
record nr 2 field nr 2 field contents: <text>
record nr 2 field nr 3 field contents: < => >
record nr 2 field nr 4 field contents: <estrenos>
record nr 2 field nr 5 field contents: <, >
record nr 2 field nr 6 field contents: <indices>
record nr 2 field nr 7 field contents: < => [ ssadw >
----
So you can see that the value you want is the 4th field of each record.
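One possible variation, not part of the command above: if the input does end in } followed by a newline, awk sees one extra record containing only that newline, and printing its 4th field gives an empty line. A sketch that guards against this by printing only when the 2nd field is the key text might look like the following. The sample input here, including the indices 117 and 125, is invented for illustration.

```shell
#!/bin/sh
# Invented sample whose last record is properly closed with "}".
input="{ 'text' => 'cine', 'indices' => [ 111, 116 ] }, { 'text' => 'estrenos', 'indices' => [ 117, 125 ] }"

# Print field 4 only for records whose field 2 is the literal key "text";
# the trailing newline-only record has an empty field 2 and is skipped.
values=$(printf '%s\n' "$input" |
    awk -v RS='}' -F\' '$2 == "text" { print $4 }')

printf '%s\n' "$values"
```

The same condition would also skip any record that happens not to start with a 'text' key, which may or may not be what you want for your data.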