Friday, May 4, 2012

I Need a Human Readable, Yet Parse-able Document Format


I'm working on one of those projects where there are a million better ways to accomplish what I need but I have no choice and I have to do it this way. Here it is:



There is a web form. When the user fills it out and hits submit, a human-readable text file is created from the form data. It looks like this:




field_1: value for field one

field_2: value for field two
more data for field two (field two has a newline in it!)

field3: some more data



My problem is this: I need to parse this text file back into the web form so that the user can edit it.



How could I accomplish this in a foolproof way? A database is not an option; I have to use these text files.



My Questions:



  • Is there a foolproof way to do this using the format in the example above?

  • What human-readable format would work better? (In other words, I can change the format.)

  • Human readable means that a non-programmer could read it and know what is what.



This project uses PHP.



UPDATE



By human readable I mean that anyone could read the text and not be overwhelmed by it, including your grandmother.


Source: Tips4all

5 comments:

  1. "I Need a Human Readable, Yet Parse-able Document Format"


    This is what YAML was designed to be. You can read more about it on their site or on Wikipedia.

    To quote Wikipedia:


    YAML syntax was designed to be easily
    mapped to data types common to most
    high-level languages: list, hash, and
    scalar. Its familiar indented
    outline and lean appearance makes it
    especially suited for tasks where
    humans are likely to view or edit data
    structures, such as configuration
    files, dumping during debugging, and
    document headers


    The advantage over XML is that it doesn't use tags, which might confuse users. I also think it's cleaner than INI (which was also mentioned) because it simply uses colons instead of equals signs, semicolons and quotes.

    Sample YAML looks like:

    invoice: 34843
    date   : 2001-01-23
    bill-to: &id001
        given  : Chris
        family : Dumars
        address:
            lines: |
                458 Walkman Dr.
                Suite #292
            city    : Royal Oak
            state   : MI
            postal  : 48046
    ship-to: *id001
    product:
        - sku         : BL394D
          quantity    : 4
          description : Basketball
          price       : 450.00
        - sku         : BL4438H
          quantity    : 1
          description : Super Hoop
          price       : 2392.00
    tax  : 251.42
    total: 4443.52
    comments: >
        Late afternoon is best.
        Backup contact is Nancy
        Billsmer @ 338-4338.
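
Applied to the form in the question, the same syntax might look like the sketch below. The `|` marker starts a literal block, so the line break in field two is preserved. The field names are copied from the question; the layout itself is only an assumption about how the file could be written, not something the asker specified.

```yaml
field_1: value for field one
field_2: |
  value for field two
  more data for field two (field two has a newline in it!)
field3: some more data
```

Lines belonging to a multi-line value must be indented under their key, which is the one rule a non-programmer editing the file by hand would need to know.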

  2. I'd say either use


    INI files or
    YAML or
    Markdown or
    Textile


    or just about any lightweight markup language you deem appropriate.

  3. You might want to look into YAML

    http://www.yaml.org/

    I agree with Pablo Fernandez's response. I think JSON might be a good choice as well.
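
To make the JSON suggestion concrete, here is a minimal round-trip sketch. It is written in Python purely for illustration (the project in the question uses PHP, where `json_encode` and `json_decode` play the same roles), and the field names are taken from the question's sample.

```python
import json

# Form data as it might arrive from the web form (names from the question).
form_data = {
    "field_1": "value for field one",
    "field_2": "value for field two\nmore data for field two",
    "field3": "some more data",
}

# indent=2 writes one field per line, which keeps the file readable.
text = json.dumps(form_data, indent=2)

# Parsing the text gives back exactly the same data, newline included.
restored = json.loads(text)
```

The round trip is lossless, but the line break in field two is stored as the two characters `\n` inside a quoted string, which is less friendly to a non-programmer than a real line break.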

  4. I'm just gonna say that an INI string is pretty readable:

    Pet_Name = "Fred"


    But, you could always roll your own format. Something like:

    Key: ValueValueValueValueValueValue
    Key: ValueValue


    Basically, you would explode the string by newlines, look for the text in front of each colon and use that as the key; the data after the colon and before the newline is the value.
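
The approach just described can be sketched as follows, extended so that a line without a leading `key:` is appended to the previous value (the question's field two needs this). The sketch is in Python for illustration only; in PHP the same logic would use `explode` and `preg_match`. The function name `parse_fields` and the rule that a key is a single word at the start of a line are assumptions, not part of the original answer.

```python
import re

# Assumption: a key is one word (letters, digits, underscores) at the
# start of a line, immediately followed by a colon.
KEY_LINE = re.compile(r"^(\w+):[ \t]?(.*)$")

def parse_fields(text):
    """Parse 'key: value' text; lines without a key continue the previous value."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = KEY_LINE.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None and line.strip():
            # Continuation line: append it to the field being read.
            fields[current] += "\n" + line
        # Blank lines are treated as separators between fields and skipped.
    return fields

sample = (
    "field_1: value for field one\n"
    "\n"
    "field_2: value for field two\n"
    "more data for field two (field two has a newline in it!)\n"
    "\n"
    "field3: some more data\n"
)
fields = parse_fields(sample)
```

This is not foolproof: a continuation line that itself starts with a word and a colon would be read as a new field, and blank lines inside a value are dropped. A format with explicit multi-line markers avoids both problems.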
