Problem with regular expressions and tok
I would like to parse lines like:
["value1", "value2", "value3",...] : goto xxx
or
"value1": goto xxx
and return the value* words.

Is tok the best way to do this?
I tried regular expressions, but I could not get them to handle an unknown number of value strings using a repeated, nested sub-pattern:
Code:
str subject="''pns1'', ''pns2'': goto xxx"
str pattern="(''([^'']+)''[\s,]*)+"
int i; ARRAY(str) a
if(findrx(subject pattern 0 0 a)<0) out "does not match"; ret
for i 0 a.len
,out a[i]
Output:
"pns1", "pns2"
"pns2"
pns2
(It did not return pns1)
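
For what it's worth, the output above looks like the usual behaviour of a repeated capturing group: each time the group repeats, its capture is overwritten, so only the last iteration ("pns2") is kept. A common workaround is to search for one quoted value at a time and collect every match. Below is a minimal sketch of that idea in Python rather than QM, because I am only certain of the Python calls; the function names are hypothetical, and it assumes the values never contain double quotes and that nothing after the colon is quoted.

```python
import re

def last_iteration_only(line):
    # Same shape as the pattern in the post: a repeated group around a
    # nested group. Only the final repetition of each group is retained.
    m = re.match(r'("([^"]+)"[\s,]*)+', line)
    return m.groups() if m else None

def quoted_values(line):
    # Match a single quoted value and let findall collect every occurrence.
    # Assumption: values contain no embedded double quotes.
    return re.findall(r'"([^"]+)"', line)

if __name__ == "__main__":
    subject = '"pns1", "pns2": goto xxx'
    print(last_iteration_only(subject))  # ('"pns2"', 'pns2')
    print(quoted_values(subject))        # ['pns1', 'pns2']
    print(quoted_values('["value1", "value2", "value3"] : goto xxx'))
```

In QM the equivalent would presumably be a find-all style search with the single-value pattern, but I have not checked which findrx flag does that, so treat this only as an illustration of the technique.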

tok() also does not behave as I expected from reading the tok help page:
Code:
; This is example 2 from the tok help page.
; It produces the output as documented.
str s = "one, (two + three) four five"
ARRAY(str) arr arr2
int i nt
nt = tok(s arr 3 ", ()" 8 arr2)
for(i 0 nt) out "'%s' '%s'" arr[i] arr2[i]


; When I add a double quote inside the parentheses, the behaviour is not what I expected:
str s = "one, (two'' + three) four five"
ARRAY(str) arr arr2
int i nt
nt = tok(s arr 3 ", ()" 8 arr2)
for(i 0 nt) out "'%s' '%s'" arr[i] arr2[i]
Output:
'one' ', ('
'two" + three) four five' ''

Should flag 8 group all text within the parentheses, including double quotes, into a single token?

