Arc Forum
Create your own collection: use directories as if they were tables with file-table
13 points by almkglor 5873 days ago | 9 comments
Hello all, I've just built a special "collection" type that treats directories as if they were tables, with filenames as keys and file contents as values. Deleting a file is as easy as (= f-tb!file nil), and creating or updating one is as easy as (= f-tb!file "the new contents").

The code is on the arc-wiki, making use of my settable-fn.arc, ($ ...), redef, and files.arc.

  arc> (load "lib/file-table.arc")
  nil
  arc> (= test (file-table "."))
  #4(tagged table #<procedure: reader> #hash((keys . #<procedure: keys>) (= . #<procedure: writer>)))
  arc> (keys test)
  (".gitignore" "BUGS" "CONVENTIONS" "ac.sbcl.lisp" "ac.scm" "app.arc" "arc-exe.scm" "arc.arc" "arc.sh" "as.scm" "bitops.scm" "blog.arc" "brackets.scm" "code.arc" "copyright" "files.arc" "html.arc" "libs.arc" "pprint.arc" "prompt.arc" "robots.txt" "srv.arc" "strings.arc" "whtml.arc" "chismaksball.arc" "news.arc" "tl" "grayarrow.gif" "graydown.gif" "infix.arc" "s.gif" "y18.gif" ".arc.arc.swp")
  arc> test!BUGS
  "This file details known issues in Anarki. \n\n***************************************************************************\nCOMPATIBILITY ISSUES\nThis section documents incompatibilites between Anarki and Arc2.\nSee also the CONVENTIONS file.\n\nThe current thinking is that new macros generally don't need to be\nmentioned here. It should be understood that defining new macros could\npotentially break code. To get a list of the added macros in Anarki,\nuse: (load \"lib/new-macros.arc\") (new-macros)\n\n*2007/02/19* When loading lib/defpat.arc, the symbol p-m can't be safely\n used as a variable for a container, or as the symbol for a function.  This\n problem will also apply for any macro defined in Anarki that is not\n already defined in Arc1.\n\n***************************************************************************\nARC1 ISSUES\n This section documents bugs in the official Arc distribution. They are \n also present in Anarki unless otherwise noted.\n\n*2007/02/19* Macros will override local variables in (let ...) and other\n local-variable-creating forms.\n\n*2007/02/18* Interpreter reads \".a\" as the EOF character. \n Arc1 treats EOF as an error, which this probably should be anyway.\n However, Anarki quits on EOF. This also applies to things like \"is!:.a\",\n which should arguably be equivalent to \"(is ': a)\".\n UPDATE: Other surprise EOFs include \"\\!\". Still present in Arc2.\n\n***************************************************************************\n"
  arc> (= test!niaw "cat, cat")
  "cat, cat"
  arc> (file-exists "./niaw")
  "./niaw"
  arc> test!niaw
  "cat, cat"
  arc> (= test!niaw nil)
  nil
  arc> (file-exists "./niaw")
  nil
  arc> test!niaw
  nil


4 points by almkglor 5873 days ago | link

TODO: a version which accepts most arc structures as keys and/or values ^^

edit: or even just arbitrary strings as keys ^^. Currently keys are filenames, but we might be able to create hashed keys.

-----

2 points by eds 5873 days ago | link

This is really awesome! Thanks almkglor!

File tables have some pretty interesting implications for persistent processes. (This would be even better if you could use data structures other than just strings as values.)

Would it be possible to associate file tables with specific directories? In the current implementation, the directory for the file table changes whenever you change the working directory, so you can't really have file tables in multiple directories at the same time.

-----

3 points by almkglor 5873 days ago | link

Should be possible if you give a fully qualified path (file-table "/home/eds/somewhere")

Anyway it's probably better to figure out how Arc can get the fully qualified path from a path string, i.e. (get-fully-qualified-path ".") => "/home/eds/arc-installation/". Probably by ($ ...)

Unless you're referring to something else?
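
For what it's worth, the fully-qualified-path idea might be sketched roughly as below. This is untested and rests on assumptions: that Anarki's ($ ...) hands back the underlying Scheme procedure when given its name, and that the MzScheme procedures path->complete-path, simplify-path and path->string are available. The name get-fully-qualified-path is only the hypothetical one used above, not an existing Anarki function.

```arc
; Hypothetical helper: resolve a possibly relative path string against
; the current working directory. Assumes ($ name) returns the Scheme
; procedure of that name so that Arc can call it directly.
(def get-fully-qualified-path (p)
  (($ path->string)              ; Scheme path object -> string
   (($ simplify-path)            ; drop "." and ".." components
    (($ path->complete-path) p))))  ; prepend the current directory

; Intended use: fix the directory once, when the file table is created.
; (= ft (file-table (get-fully-qualified-path ".")))
```

Resolving the path at creation time would keep the table pointed at the same directory even if the working directory changes later.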

As for "other values".... the only real problem anyway is serialization. Ordinary Arc values that can be input into Arc via text are of course trivially serializable, but something created via:

  (= x (cons nil nil))
  (= y (cons x x))
  (= (cdr x) y)
is obviously not trivially serializable.

-----

2 points by eds 5872 days ago | link

Yeah, (file-table "/home/eds/somewhere") is what I wanted, although having "." expand to `pwd` would be nice as well.

As for serialization, doesn't scheme have the ability to print circular objects? Not that that was really what I was thinking of.

Right now, all data that gets read from a file is in string form, so you have to explicitly

  (read ftab!fname)
to get an object stored in a file. Also, if you want to write an object like a list of strings to a file, you have to do something like

  (= ftab!fname (tostring (write '("hello," "world!"))))
to get the contents of the file to be readable.

It's not that serialization is difficult, but this seems like repetitive code that should be abstracted away. Maybe we should have one type/mode/version of file-table that reads everything as a string, and another that automatically prints a readable representation so that the object can be read back in later.

-----

3 points by almkglor 5872 days ago | link

Should be possible to abstract a layer around file-table for that (untested):

  (def file-table-w/read (path)
    (let ft (file-table path)
      (add-attachments
        '= (fn (v k)
             (= (ft k) (tostring (write v))))
        'keys (fn () (keys ft))
        (annotate 'table
          (fn (k) (read (ft k)))))))
Of course, the above doesn't memoize, so repeated reads of ftab!fname will return distinct objects that are equal but not identical: (iso ftab!fname ftab!fname) => t, but (is ftab!fname ftab!fname) => nil.

And of course, you're not supposed to write circular objects with the above.

And then someone will want to use temload and friends... hmm. Need a good way of abstracting the abstractions...
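
The missing memoization could be sketched as a small caching layer in front of the reader (also untested). cached-reader is a hypothetical name, and the sketch only assumes that a file-table can be called like a function with a key, as in the wrapper above.

```arc
; Hypothetical helper: wrap a file-table in a reader that parses each
; file at most once and hands back the same object on later reads.
(def cached-reader (ft)
  (let cache (table)
    (fn (k)
      (or (cache k)                        ; already parsed: reuse it
          (aif (ft k)                      ; file exists: parse and store
               (= (cache k) (read it)))))))

; Intended use:
; (= rd (cached-reader (file-table ".")))
; (is (rd "somefile") (rd "somefile"))    ; same object both times
```

Two caveats with this sketch: the cache goes stale if the file changes on disk or is rewritten through the table, and a file whose contents read back as nil is re-read on every call.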

-----

2 points by eds 5872 days ago | link

Thanks. It might be nice to have that on Anarki.

EDIT: And it also might be nice to have the path parameter be optional (and default to ".").

-----

3 points by almkglor 5872 days ago | link

Hmm. What I'm currently planning is to have 'tofile and 'fromfile tagged arguments instead:

  (def file-table-w/read (path)
    (file-table path
                'tofile    [tostring (write _)]
                'fromfile  read))
Then memoization will be done on the file-contents (ct and mt) tables instead of actual file-table objects.

It would be useful also to have non-memoized versions, accessible via 'nocache:

  (def grep* (rex path)
    (zap re rex)
    (accum collect
      (ontable k v (file-table path 'nocache)
        (if (re-match rex v) (collect k)))))
'nocache would be useful for such cases where you want to scan through files but not cache their actual contents.

It might also be useful to store cached contents only for a certain time, to preserve memory (but then gc will get run anyway).

p.s. supporting tagged options will make an optional path argument difficult. I suppose I can check if the first argument is a string, though, and treat it as the path if it is.
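
A rough sketch of that string check, with the path defaulting to ".". parse-ft-args is a hypothetical helper name, not something in file-table.arc:

```arc
; Hypothetical helper: split file-table's argument list into
; (path options). If the first argument is a string it is taken as the
; path; otherwise the path defaults to "." and every argument is
; treated as a tagged option.
(def parse-ft-args args
  (if (isa (car args) 'string)
      (list (car args) (cdr args))
      (list "." args)))

; (parse-ft-args "/home/eds/somewhere" 'nocache)
;   => ("/home/eds/somewhere" (nocache))
; (parse-ft-args 'nocache)
;   => ("." (nocache))
```

The cost of this design is that a tagged option can never itself be a string in first position, which seems acceptable as long as the tags are symbols.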

-----

1 point by almkglor 5873 days ago | link

Untested grep-like function:

  (def grep* (rex path)
    (let rex (re rex)
      (accum return
        (ontable k v (file-table path)
          (if (re-match rex v) (return k))))))
The file contents it reads stay cached due to memoization, I'm afraid, so expect memory use to grow fast.

-----

5 points by almkglor 5873 days ago | link

If strings were lists, and lists had a promise in cdr position instead of an actual object, then more laziness could be had.
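
A minimal sketch of what "a promise in cdr position" could look like in plain Arc. All of the names (lcons, lcdr, ltake, ints-from) are hypothetical, and the tail here is a bare thunk rather than a memoized promise, so forcing it twice recomputes it.

```arc
; Build a pair whose tail is delayed: the cdr holds a zero-argument
; function instead of the rest of the list.
(mac lcons (a d)
  `(cons ,a (fn () ,d)))

; Force the delayed tail by calling the thunk stored in the cdr.
(def lcdr (s)
  ((cdr s)))

; Copy the first n elements of a lazy list into an ordinary list.
(def ltake (n s)
  (if (and s (> n 0))
      (cons (car s) (ltake (- n 1) (lcdr s)))))

; An infinite lazy list of integers starting at n.
(def ints-from (n)
  (lcons n (ints-from (+ n 1))))

(ltake 3 (ints-from 1))   ; => (1 2 3)
```

With file contents exposed this way, something like grep* would only need to pull in as much of each file as it actually inspects.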

-----