Accessing nested block data
Started by dragoncity on 14-Nov-2013/7:12:34-8:00
dragoncity — 14-Nov-2013/7:12:34-8:00
I'm trying to process a file of LaTeX document index references in Rebol.
The input file consists of records like these:
\indexentry{loop}{22}
\indexentry{gui}{32}
... etc ...
I extract the required text parts and put each entry into a block:
indexentry: [ ["loop" ["22"]] ["gui" ["32"]] .... ]
however, I need to find any duplicate entries as I process the input file
and simply update the inner block of page numbers.
>> do %scantexindex.r3
Script: "Edit LaTeX .idx file" Version: none Date: 12-Nov-2013
14-Nov-2013/18:10:17+11:00
++:whileTrue: 32
Reference: whileTrue: PageNo: 32
++:Stream~File 33
Reference: Stream~File PageNo: 33
++:FileStream 33
Reference: FileStream PageNo: 33
++:nextPut: 33
Reference: nextPut: PageNo: 33
++:FileStream 38
Reference: FileStream PageNo: 38
++:Stream~File 201
Reference: Stream~File PageNo: 201
I produce this :
["ReadStream~whileTrue:" ["32"]]
["Stream~File" ["33"]]
["FileStream" ["33"]]
["nextPut:" ["33"]]
["FileStream" ["38"]]
["Stream~File" ["201"]]
But this is what I want to achieve :
["ReadStream~whileTrue:" ["32"]]
["Stream~File" ["33" "201" ]]
["FileStream" ["33" "38"]]
["nextPut:" ["33"]]
============================================================
I can do this manually ( in a console )
>> t: ["loop" ["22"]]
== ["loop" ["22"]]
>> pick t/2 1
== "22"
>> append/only t/2 "77"
== ["22" "77"]
>> t
== ["loop" ["22" "77"]]
===========================================================
But I can't figure out how to access the separate fields in the
found entry when inside the foreach loop.
eg:
foreach [ entry ] indexlist
[ if find entry "FileStream" [
    prin "***** Found "
    ;; here I've found an entry, but can't update it !!!
  ]
]
===================== complete program ============================
REBOL [Title: "Edit LaTeX .idx file" Date: 12/11/2013 ]
; to use : ./rebol scantexindx.r3 > xxxx.txt
; then : move xxxx.txt to latex directory and rename as required
; eg:
; \indexentry{loop}{22} ---> loop 22
infile: read/lines/string %/home/brett/Saphir/smalltalk.idx
; -- short list
indexlist: copy [] ; create empty list
print now
foreach line infile [
prin cr print line
replace/all line "\indexentry" ""
replace/all line "{" ""
replace/all line "}" " "
trim line ; remove any leading/trailing spaces
prin "++: " print reduce [ line ]
pos: find line space
ref: copy/part line pos
rest: copy next pos
prin "Reference: " prin ref prin tab prin " PageNo: " print rest
append/only indexlist remold [ ref reduce [ rest ] ]
] ; end major loop
print "" print indexlist
print space print "***** search "
foreach [ entry ] indexlist
[ if find entry "FileStream" [
    prin "***** Found "
    t: entry
    print t
  ]
]
==================== end program ==================================
***** search
***** Found ["FileStream" ["33"]]
***** Found ["FileStream" ["38"]]
== none
There is most likely a very obvious way to do this !! ??
Endo — 14-Nov-2013/8:46:04-8:00
Hi,
I assume that you already prepare the first block, so the rest is here:
REBOL []
;input values
b: [
    ["ReadStream~whileTrue:" ["32"]]
    ["Stream~File" ["33"]]
    ["FileStream" ["33"]]
    ["nextPut:" ["33"]]
    ["FileStream" ["38"]]
    ["Stream~File" ["201"]]
]
;prepare a block
c: copy []
foreach v b [append c reduce [v/1 copy []]]
c: unique/skip c 2
;fill the block with necessary values
foreach v b [
    if p: find/tail c v/1 [
        append p/1 v/2/1
    ]
]
? c
halt
Endo — 14-Nov-2013/8:46:44-8:00
The result is:
C is a block of value: ["ReadStream~whileTrue:" ["32"] "Stream~File" ["33" "201"] "FileStream" ["33" "38"] "nextPut:" ["33"]]
Nick — 14-Nov-2013/10:00:03-8:00
Hehe, Endo beat me to it, and as always with a slick Rebolish answer. Here's my simple solution, including a loop to build the initial data block you presented (labeled 'foundlist here):
REBOL []
foundlist: copy []
foreach [entry] indexlist [
if find entry "FileStream" [
print rejoin ["***** Found " entry]
append foundlist entry
]
]
keys: copy []
foreach line foundlist [append keys line/1]
keys: unique keys
finallist: copy []
foreach key keys [
final: copy reduce [key copy []]
foreach found foundlist [
if found/1 = key [append final/2 found/2]
]
append finallist final
]
probe finallist
halt
dragoncity — 15-Nov-2013/1:16:32-8:00
Thanks for the responses, unfortunately, neither actually worked
using my data :-)
I was somewhat amused that I needed to actually transfer block data
into new blocks
as I expected to be able to work on the data 'in place' as it were.
ie: having found an item, update "here"
( But that's OK :-)
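A minimal sketch of the "update here" idea, assuming every entry in the list is a real nested block (built with reduce) and not a molded string. The helper name add-page and the sample data are hypothetical, made up only for this illustration. Inside foreach, the word entry refers to the very block stored in the list, so appending to entry/2 changes the list itself:

```rebol
REBOL [Title: "In-place index update (sketch)"]

; sample data in the nested shape shown in the first post (assumed)
indexlist: copy/deep [
    ["Stream~File" ["33"]]
    ["FileStream" ["33"]]
]

; add-page is a hypothetical helper, not part of the original scripts
add-page: func [list [block!] ref [string!] page [string!]] [
    foreach entry list [
        if entry/1 = ref [
            append entry/2 page   ; entry is the same block that sits in list
            return list
        ]
    ]
    ; no existing entry for ref: add a new [ref [page]] record
    append/only list reduce [ref reduce [page]]
    list
]

add-page indexlist "FileStream" "38"   ; existing key: page is appended
add-page indexlist "nextPut:" "33"     ; new key: record is added
probe indexlist
```

With the sample data this should leave "FileStream" holding both "33" and "38", plus a new "nextPut:" record. Calling such a helper once per input line would merge duplicates while the file is being read, without building a second list.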
The results from Nick's code were so weird, ie lots of empty values, eg
["" "" .... etc, that I chose to work on Endo's example for the moment.
Notice how different the data is at the "## X" & "## b" tags, below at
"results of run",
the ##X is created by my code reading the input file, the ##b is Endo's
internally defined block, containing very similar textual data, but
they are actually quite different data formats !
So of course when I tell the code to use my data it fails, how do I
reformat my data into the internal block format so Endo's code will work ?
========================================================================
REBOL [Title: "Edit LaTeX .idx file - Endo version" Date: 12/11/2013 ]
; to use : ./rebol scantexindx.r3 > xxxx.txt
; then : move xxxx.txt to latex directory and rename as required
; eg:
; \indexentry{loop}{22} ---> loop 22
infile: read/lines/string %/home/brett/Saphir/smalltalk.idx
; -- short list
indexlist: copy [] ; create empty list
print now
foreach line infile [
prin cr print line
replace/all line "\indexentry" ""
replace/all line "{" ""
replace/all line "}" " "
trim line ; remove any leading/trailing spaces
; prin "++: " print reduce [ line ]
pos: find line space
ref: copy/part line pos
rest: copy next pos
; prin "Reference: " prin ref prin tab prin" PageNo: " print rest
;; append/only indexlist [ ref reduce [ rest ] ]
append/only indexlist remold [ ref reduce [ rest ] ] ; original
] ; end major loop
prin " ## X " print indexlist
;input values set up by "endo" ( a forum person :-)
b: [
["ReadStream~whileTrue:" ["32"]]
["Stream~File" ["33"]]
["FileStream" ["33"]]
["nextPut:" ["33"]]
["FileStream" ["38"]]
["Stream~File" ["201"]]
]
; input value set up by my program ( brett )
;;; b: copy reform [ indexlist ]
prin " ## b " print b
;prepare a block
c: copy []
foreach v b [append c reduce [v/1 copy []]]
c: unique/skip c 2
;fill the block with necessary values
foreach v b [
if p: find/tail c v/1 [
append p/1 v/2/1
]
]
? c
================= results of run ========================
>> do %scantexindex-endo.r3
Script: "Edit LaTeX .idx file - Endo version" Version: none Date: 12-Nov-2013
15-Nov-2013/17:12:17+11:00
\indexentry{whileTrue:}{32}
\indexentry{Stream~File}{33}
\indexentry{FileStream}{33}
\indexentry{nextPut:}{33}
\indexentry{FileStream}{38}
\indexentry{Stream~File}{201}
## X ["whileTrue:" ["32"]] ["Stream~File" ["33"]] ["FileStream" ["33"]] ["nextPut:" ["33"]] ["FileStream" ["38"]] ["Stream~File" ["201"]]
## b ReadStream~whileTrue: 32 Stream~File 33 FileStream 33 nextPut: 33 FileStream 38 Stream~File 201
C is a block of value: ["ReadStream~whileTrue:" ["32"] "Stream~File" ["33" "201"] "FileStream" ["33" "38"] "nextPut:" ["33"] "FileStream" [] "Stream~File" []]
>>
=================== my example data file ( smalltalk.idx )===============
\indexentry{whileTrue:}{32}
\indexentry{Stream~File}{33}
\indexentry{FileStream}{33}
\indexentry{nextPut:}{33}
\indexentry{FileStream}{38}
\indexentry{Stream~File}{201}
==========================================
dragoncity — 15-Nov-2013/6:14:49-8:00
Nailed it! Figured out how to match the file created block to what Endo's code needed.
It was the result of lots of trial & error, not exactly straightforward !!!
The double REDUCE (instead of REMOLD) in the original data formatting
solved the problem.
===============================
append/only indexlist reduce [ ref reduce [ rest ] ]
; create block 'like' internally defined block without enclosing quotes
;eg: whileTrue: 32 Stream~File 33 | NOT | "whileTrue:" "32" "Stream~File" "33"
============================================
I'm not doubting the value of Rebol , but its data handling is a bit weird :-)
Thanks again for your help.
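A minimal sketch of why that one-word change matters, using made-up variable names (as-text, as-block). remold is shorthand for mold reduce, so it returns the source text of the block as a string, whereas reduce returns the nested block itself:

```rebol
REBOL [Title: "remold vs reduce (sketch)"]

ref: "FileStream"
rest: "33"

as-text:  remold [ref reduce [rest]]   ; molded source text, a string!
as-block: reduce [ref reduce [rest]]   ; nested data, a block!

probe string? as-text    ; true
probe block? as-block    ; true

probe as-block/1         ; "FileStream"
probe as-block/2/1       ; "33"
probe as-text/1          ; #"[" (first character of the text, not the key)
```

That would also explain why find entry "FileStream" still matched in the original loop: find was searching inside a string, not looking up an item in a block.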
Nick — 15-Nov-2013/8:13:08-8:00
I probably wasn't clear enough about the first part of my code. It *creates* the data block that Endo provided for you:
REBOL []
foundlist: copy []
foreach [entry] indexlist [
if find entry "FileStream" [
print rejoin ["***** Found " entry]
append foundlist entry
]
]
This part of the code does the exact same thing as Endo's example. My block labeled 'foundlist is the same as his block labeled 'b - the code above creates that block, from the raw data example that you provided. That's probably why you saw a bunch of empty values - the foundlist block was empty if you didn't provide it your initial raw dataset:
REBOL []
foundlist: [
["ReadStream~whileTrue:" ["32"]]
["Stream~File" ["33"]]
["FileStream" ["33"]]
["nextPut:" ["33"]]
["FileStream" ["38"]]
["Stream~File" ["201"]]
]
keys: copy []
foreach line foundlist [append keys line/1]
keys: unique keys
finallist: copy []
foreach key keys [
final: copy reduce [key copy []]
foreach found foundlist [
if found/1 = key [append final/2 found/2]
]
append finallist final
]
probe finallist
halt
Nick — 15-Nov-2013/8:21:54-8:00
PS - I used your example with the 'indexlist label to create the first part of the code above, so that you could just drop it into your existing code (I tried to do a bit more automation for you than in Endo's example). If your data formatting changed from the example you gave, of course, it won't work properly:
foreach [ entry ] indexlist
[ if find entry "FileStream" [
prin "***** Found "
t: entry
print t
]
]
Nick — 15-Nov-2013/12:44:03-8:00
BTW, it looks like you're doing a lot of extra effort converting your original .idx file, where you should just use parse. The following script does the entire process for you:
REBOL []
foundlist: copy []
foreach line read/lines %smalltalk.idx [append/only foundlist parse line "{}"]
keys: copy []
foreach line foundlist [append keys line/2]
keys: unique keys
finallist: copy []
foreach key keys [
final: copy reduce [key copy []]
foreach found foundlist [
if found/2 = key [append final/2 found/4]
]
append finallist final
]
probe finallist
halt
Using shorter variable labels like Endo's:
REBOL []
b: copy [] foreach l read/lines %smalltalk.idx [append/only b parse l "{}"]
k: copy [] foreach l b [append k l/2] k: unique k
f: copy [] foreach v k [
n: copy reduce [v copy []]
foreach d b [if d/2 = v [append n/2 d/4]] append f n
]
editor f
Nick — 15-Nov-2013/12:59:48-8:00
And here's your entire program using Endo's example, with parse:
REBOL []
b: copy [] foreach line read/lines %smalltalk.idx [append/only b parse line "{}"]
c: copy [] foreach v b [append c reduce [v/2 copy []]]
c: unique/skip c 2 ; c holds key/block pairs, so the record size is 2
foreach v b [if p: find/tail c v/2 [append p/1 v/4]]
editor c
Nick — 15-Nov-2013/14:38:30-8:00
Either of those previous examples reads your smalltalk.idx file, parses it, and formats into blocks as you requested.
dragoncity — 15-Nov-2013/21:59:16-8:00
REBOL [ Title: "scantexindex-nick3"]
; a working version from Nick, using file input instead of internal data block
print "**input text file of : \indexentry{nextPut:}{pageno} ...."
foundlist: copy []
foreach line read/lines %smalltalk.idx [append/only foundlist parse line "{}"]
prin "%%read:" print foundlist
keys: copy []
foreach line foundlist [append keys line/2] ; build key ( names ) list
keys: unique keys
prin "@@keys:" print keys
finallist: copy []
foreach key keys [                                ; use key block
    final: copy reduce [key copy []]              ; to
    foreach found foundlist [                     ; scan foundlist block
        if found/2 = key [append final/2 found/4] ; building (pageno) block
    ]
    append finallist final                        ; making finallist block
]
prin "&&probe:" probe finallist
prin "$$print:" print finallist
halt
dragoncity — 16-Nov-2013/1:54:07-8:00
Brilliant !
Thanks Nick & Endo,
I had attached this text to the revised program, above, that I sent, but it did not seem to go?
You & Endo have done a nice job of showing off Rebol.
Esp. how you reduced my 13 lines of code to process the incoming text records to ONE !!
I have added a few PRINT commands to display the effects of the data passing through the
code to its intended end, should anybody be interested. This example will find
its way into my Collected Notes on Rebol.
Endo's 6 line result is an amazing demo of the power of Rebol & the need to understand
how to manipulate Rebol Blocks.
BTW: this rebol program will replace a 3 x A4 page Ada program which I wrote a few years ago and does much the same thing.
Thanks Again.
dreamyToto — 17-Nov-2013/13:39:52-8:00
Another slightly different version with code nested in PARSE block, seems to work !
REBOL [
    Title: "scantexindex-dreamyToto"
]
index: copy []
result: copy []
foreach line read/lines %smalltalk.idx [
    parse line [
        thru "\indexentry{" copy key to "}" thru "{" copy page-no to "}" to end (
            page-no: to-integer page-no
            print rejoin ["[PARSED] key=" key " - page-no=" page-no]
            either found: find/skip index key 2 [
                append found/2 page-no
            ][
                blk: reduce [key (copy reduce [page-no])]
                append index blk
                append/only result blk
            ]
        )
    ]
]
probe result
dragoncity — 17-Nov-2013/23:16:41-8:00
Hi dreamyToto,
nice solution. It's actually more in the style of my Ada program, with the either ... else coding, which is how I was developing my program, compared to Nick's and Endo's more Rebol'ish solutions.
It's interesting to see your use of Parse.
Thanks for your interest.
dreamyToto — 18-Nov-2013/15:01:22-8:00
@dragoncity
Thank you. I should do that more often. It's a way to progress for me, don't have many occasions to use Rebol...
This code came to me like this, with the re-use of the same block of page numbers between the "index" block (for searching) and the "result" block !
It's Rebol code, don't know if it's Rebol'ish ! That's an interesting notion : is my code Rebol'ish ? Is it written respecting Rebol spirit ? I cannot reply myself !
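A minimal sketch of that block re-use, with made-up sample values. Both append calls store the same page-number block, one reachable through the flat "index" and one through the nested "result", so a change made through either is visible through the other:

```rebol
REBOL [Title: "Shared page-number block (sketch)"]

index: copy []
result: copy []

blk: reduce ["loop" copy [32]]
append index blk          ; index holds the key and the pages block, flat
append/only result blk    ; result holds blk itself as one nested record

append index/2 77         ; update through index ...

probe same? index/2 result/1/2   ; true: one block, two references
probe result                     ; [["loop" [32 77]]]
```

This is why the script only ever appends page numbers through "index" yet probes "result" at the end.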
Nick — 18-Nov-2013/16:38:36-8:00
@dreamyToto
To me, Rebol's "spirit" is about keeping things simple. Rebol/core code is characterized by series manipulations, unnecessary syntactic cruft is avoided, and rebolers tend to craft short, readable solutions. Your code looks like Rebol to me because there are some familiar code patterns, and the whole parse evaluation, I think, is itself a Rebol'ism. You collapse some potential multiple lines of code into one, which is another Rebol'ism (doing any more of that in your code would just reduce readability). At first glance your parse approach can't be reduced to any simpler algorithm, so yes, it's Rebol'ish I think. To me, Endo's code seemed closest to what I think of as Rebol's "spirit" for short core examples like this, because he used refined series functions to craft the most concise solution possible.
Nick — 19-Nov-2013/4:45:17-8:00
I think for bigger problems, Rebol's goal is to enable users to build dialects for other users, which make use of the simple core API, series structure, native data types, etc., using the ability of parse to define elegant and simple language models which solve a given problem, without extra syntactic cruft. I don't think this has been explored nearly to the degree Carl had intended, in his initial design of the language. If users could grasp how powerful this concept is at simplifying code patterns, I think Rebol-like languages would enjoy a much more important place among modern development tools.
dreamyToto — 19-Nov-2013/14:02:54-8:00
@Nick
I like the fact that you can do the job using only 1 loop, the one reading the file line by line.
If the file is not too big, you can even remove the "foreach" loop and use "any" keyword in PARSE like that :
REBOL [
    Title: "scantexindex-parse-any-dreamyToto"
]
index: copy []
result: copy []
file-content: to-string read %smalltalk.idx
parse file-content [
    any [
        thru "\indexentry{" copy key to "}" thru "{" copy page-no to "}" to newline (
            page-no: to-integer page-no
            print rejoin ["[PARSED] key=" key " - page-no=" page-no]
            either found: find/skip index key 2 [
                append found/2 page-no
            ][
                blk: reduce [key (copy reduce [page-no])]
                append index blk
                append/only result blk
            ]
        )
    ]
]
probe result
Concerning dialects, I dream of a big software like SAP written only in Rebol or Red (maybe more Red because it targets better server performances under heavy load and it will have concurrency/multi-tasking). You could exchange Rebol blocks containing business dialects (DSL) between modules (ordering, invoicing, accounting,...) ! Could be great ! Let's dream !
Nick — 19-Nov-2013/16:53:07-8:00
If only we could see the kind of interest develop around Red, now, which developed around Rebol during the early 2000's, the industry could see developer productivity increase in a big way. Rebol failed commercially largely because it was closed source, and the community was ignored by Carl for several long periods. I think Red will be in a great position to gain some traction during the next year. We need to keep Doc funded :)