nancylebov ([personal profile] nancylebov) wrote 2004-05-11 02:31 pm

A Question for Academics

I've heard laments that the rise of the telephone meant that there weren't letters to study.

Now that the net has taken hold, is there getting to be unmanageably much writing from the people you want to study?

[personal profile] cheshyre 2004-05-11 11:51 am (UTC)
Not necessarily.
Think about incompatible formats. Outdated programs or media. [I still have 5 1/4 floppies from college with work written in DisplayWrite.]
And new media is often much more fragile than paper. Not to mention new media companies.
If Google or the WebArchive or LiveJournal goes under, how much will be lost?

[personal profile] cheshyre 2004-05-11 11:52 am (UTC)
FWIW, if you want more thorough answers, this may be more a question for archivists than academics.

[identity profile] nancylebov.livejournal.com 2004-05-11 12:52 pm (UTC)
Thanks for the hint.

Archiving is a big job, but so is reading and thinking and writing about the material once it's found.

As I understand academic writing, a lot of the point is to pin down all the corners and be very complete and exact. How do you handle it when there's no hope of completeness?

[identity profile] nancylebov.livejournal.com 2004-05-11 12:49 pm (UTC)
It's all fragile but meanwhile it's there, and some people write a *lot*.

More fragile all right!

[identity profile] dglenn.livejournal.com 2004-05-11 06:18 pm (UTC)
This is why I keep personal archives of my LJ entries (using the built-in tool; I haven't gotten around to writing a tool that'll suck down all the comments as well), and have always tried to keep an archive of all my Usenet posts and the followups they've generated.

It's all still fragile, especially considering that for all my "just in case I want it later" and "just in case a hundred years from now I wind up being interesting to a historian" thinking, I'm still archiving casually, not really as an archivist would. And some things have slipped through the cracks -- drafts of web pages that may not have gotten archived, Usenet posts that I forgot to log, and three different periods from which I have mail or other writing archived in formats I can't currently read. (Two different cartridge tape formats and a spool of 9-track ... but the data are all in plain ASCII, IIRC.) Eventually, when I obtain the elusive round tuit, I do want to get all the old stuff read, all the "current" (early 1990s to present) material sorted, and burn CDs in a couple of different formats and maybe even put a copy on a spare IDE drive I can then unplug and leave at my mother's house or something. But that's still not proper archiving and I'm perfectly aware of that.

So yes, it's fragile.

Personally, I feel nothing quite beats carving in stone for single copy durability, but nothing is really safe until it's been published ... and a hundred thousand or so copies sold, on multiple media, and deemed important enough by enough other people that a) several sites will take care to archive it carefully, and b) successive generations will read it often enough to make sure they remember how and/or copy it to forms they can still read.

Dante was considered important enough to translate. Homer was considered important enough to translate and to have his language taught in high schools in countries that don't even speak the modern version of it. ("Format incompatibility" is a much bigger problem in the digital age but isn't completely new.) It'd be hubris for me to take much more serious measures than what I've got planned without first achieving that kind of cultural importance (not that I'm immune to hubris, of course), but just making sure that it's all in a form that can be conveniently read by most people at the time of my death ought to do for someone like me.

This is something folks on photography mailing lists and newsgroups worry about too. For all the advantages of digital (especially in terms of "workflow" and meeting deadlines in high-turnaround businesses), the fact remains that we can still make prints from glass plates exposed more than a hundred years ago, while it's difficult to find anyone who can read the 5.25" TRSDOS and CP/M diskettes I was using less than twenty years ago.

Re: More fragile all right!

[identity profile] nancylebov.livejournal.com 2004-05-12 03:46 am (UTC)
I hadn't thought about Homer as the most superb example of memetic persistence. That's very satisfying.

What's proper archiving?

Even if you get your personal archive organized, what would you do to improve the odds of an interested scholar finding it?

I keep waiting for digital jukeboxes which are designed to read a significant number of hardware/software formats. I think there'd be a market, and I don't think it would be technically extremely difficult to do a pretty adequate job. (Technically finicky, yes--but I don't think it would take wildly new invention.)

When I say adequate, I mean that it could cover a lot of the more common formats--I'm not expecting it to cover everything--there are a lot of obscure languages.

Re: More fragile all right!

[identity profile] dglenn.livejournal.com 2004-05-16 10:09 am (UTC)
Proper archiving in this case would be printout on acid-free paper with copies stored in multiple locations (far enough apart that the same flood or earthquake won't take out all copies simultaneously), plus copies on microfilm/microfiche; the electronic copies stored in multiple current formats (the most generically/easily readable of current formats -- right now that would be parallel plaintext and XML versions), again in multiple locations in different geographic regions, web-accessible and vault-stored offline, with the offline copies on media formatted for each of the current major operating systems ...

... and a set of scripts set up to manage migrating the archive to the next new storage medium, OS, or file format so that today's solution doesn't become the five-decades-hence "well it was thorough at the time but now it's obsolete" stash. (Obviously the migration tools themselves would have to migrate.) And because I'm paranoid, whenever an operating system or storage medium became obsolete, functioning computers supporting that OS/medium, plus a supply of spare parts, would be added to equipment collections (geographically distant, yadda yadda yadda).
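As a rough illustration of the "parallel plaintext and XML, in multiple locations" part of that protocol, here is a minimal Python sketch. It is only one possible reading of the idea, not anything the commenter actually ran: the function names `archive_entry` and `verify`, the `entry` element with its `id` attribute, and the notion of a checksum manifest are all assumptions made for the example. Real geographic separation, microfilm, and format migration are out of scope here.

```python
import hashlib
from pathlib import Path
from xml.etree import ElementTree as ET


def archive_entry(entry_id, text, roots):
    """Write one entry as parallel .txt and .xml files under every root.

    Returns a dict mapping each written file path to its SHA-256 digest.
    All names here are hypothetical, chosen for the illustration.
    """
    # Build the XML version once; the element and attribute names are made up.
    elem = ET.Element("entry", {"id": entry_id})
    elem.text = text
    xml_bytes = ET.tostring(elem, encoding="utf-8")
    txt_bytes = text.encode("utf-8")

    manifest = {}
    for root in roots:
        root = Path(root)
        root.mkdir(parents=True, exist_ok=True)
        # Same content, two formats, in every storage location.
        for suffix, data in ((".txt", txt_bytes), (".xml", xml_bytes)):
            path = root / (entry_id + suffix)
            path.write_bytes(data)
            manifest[str(path)] = hashlib.sha256(data).hexdigest()
    return manifest


def verify(manifest):
    """Return the paths that are missing or no longer match the manifest."""
    damaged = []
    for path, digest in manifest.items():
        p = Path(path)
        if not p.exists() or hashlib.sha256(p.read_bytes()).hexdigest() != digest:
            damaged.append(path)
    return damaged
```

Calling `archive_entry("2004-05-16", "some text", ["/mnt/site_a", "/mnt/site_b"])` (the mount points are placeholders) would write four files, and running `verify` on the returned manifest later would flag any copy that has gone missing or changed. A migration script in the sense described above would then be a matter of reading a known-good copy and re-emitting it in whatever the next format is.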

Clearly my own personal writing just doesn't warrant that kind of effort and expense. But if I were storing something of vital cultural, scientific, or commercial importance, that I couldn't count on simple proliferation of published copies to make safe for me, that's how I'd go about it. I'm not sure what would be that important but also not published, but various subsets of that protocol are used for certain real-world applications.

"Even if you get your personal archive organized, what would you do to improve the odds of an interested scholar finding it?"

Become famous. Madonna/Mick Jagger/Beethoven famous. :-) They'll find it.

Failing that -- assuming the scholar is looking for "slice of life" writings or "how did people think then" material rather than "famous life under a microscope" -- I'd try to be entertaining enough to be [livejournal.com profile] theferrett-level or Samuel Pepys-level famous and try to make sure that pointers (or clues, when the pointers become stale) to copies of the archive are attached to some non-trivial number of the things that get quoted or archived by other means.

But really, barring fame, that's a whole 'nuther question. There are organizations dedicated to preserving snapshots of the web, including personal web sites of people who have died, to make such material available and findable, but that puts it back into the relying-on-survival-of-an-organization category.

Re: More fragile all right!

[identity profile] papersky.livejournal.com 2004-05-16 09:43 am (UTC)
I don't back up my LJ and I don't save my usenet posts, and I don't even save my own email, because it all seems to me to be ephemeral conversation and I can't imagine it being of interest in the long term -- or if I do it's in the sense of "I suppose I ought to delete this".

This is the opposite attitude, I suppose.

Re: More fragile all right!

[identity profile] dglenn.livejournal.com 2004-05-16 10:50 am (UTC)
I've got an additional reason for backing up my Usenet posts and email -- since copies I can't delete are going to be floating around indefinitely, I should have my own local copy so I can check for what I've forgotten that might surface in a job interview or whatnot, and also so that I have a record of what I actually said in case someone tries to twist my words against me in the future. Paranoid, I know. I got into a few flamewars back before the Great Renaming, and the habits stuck. But that's a reason for a personal archive, not a preservational one...

As for imagining it to be interesting in the long term, I don't really expect it to be. It's just that nagging voice that keeps saying, "Isn't it a shame I can't read the diary of a 13th Century farmer to get a clearer picture of how the commoners lived and thought, rather than having to rely on paintings and what the educated classes wrote about the farmers?"

Then again, although I put a certain amount of care and craft into much of what I write, I'm not in the business of producing "'real' writing meant to be preserved" (by which, in this context, I mean literature, intended for publication and meant to still be read more than a few years from now), so to me, my writing doesn't have a division between "this is a 'keeper' and that is just conversation", and that may also affect my inclination to toss it all in the archive pile. (Okay, there is some division -- there's what I post on my web site vs. everything else -- but the division isn't so sharp.)

[identity profile] nosebeepbear.livejournal.com 2004-05-13 09:51 am (UTC)
If ... LiveJournal goes under, how much will be lost?

I have nightmares about that. I back up my own stuff, of course, but that's not what you mean.

I use LJ not only to keep in touch with my friends, but to re-create timelines and figure out how things in the past fit together. Impossible to do with just my own journal.

[identity profile] dglenn.livejournal.com 2004-05-11 06:22 pm (UTC)
Yes, but I don't think it's because the few people I would have been interested in have more writing available; rather because there are so many additional interesting people for me to accidentally stumble across ... and I tend to be interested in way too many things at once.

If I pick a small handful of living people, I'll always be able to read faster than they can write. My whole friends list, OTOH, I have trouble keeping up with just what they post here on LJ.

A dead person with a seventy-year head start, whose entire output I'm trying to read all at once ... well one expects that to take a while, right?

[identity profile] nancylebov.livejournal.com 2004-05-12 03:51 am (UTC)
I was bringing up the academic problem--not just reading, but reading and then thinking and writing about what one has read.

Also, just suppose one was doing an overview of my net writing. Google groups turns up 36,400 posts that I've written. I'm far from the most prolific person on line, and most of those posts are fairly short. Still, that's rather a lot, and I wonder how much you'd have to read to do adequate context for them, especially if you wanted to evaluate what I've occasionally said about newsgroups.

[identity profile] nosebeepbear.livejournal.com 2004-05-13 09:53 am (UTC)
Google groups turns up 36,400 posts that I've written.

Hear that, folks? There will henceforth be no more cracks about my measly 900-some-odd LJ posts ;)

[identity profile] dglenn.livejournal.com 2004-05-16 10:58 am (UTC)
At some point I skim, looking for the interesting bits, and only following on to the context "when it seems needed". I guess there's something to be said for leaving some work for the next scholar to do...

Too much info? yes and no

(Anonymous) 2004-05-22 11:47 am (UTC)
The huge importance of computers and the Internet has definitely changed how archives work/will work and how data will be stored for generations, but it's hard to say if it's a change toward more information or just different information. Google Groups and blogs and other stuff mean that you can find almost anyone's opinion on almost anything. And an equally big revolution is that you can do it sitting naked in your living room rather than going to a library or town hall or whatever.

But, on the other hand, your post reminded me of something I read on a Terry Pratchett fan web site. Here's a link (http://www.co.uk.lspace.org/books/pqf/alt-fan-pratchett.html) to the page. The quote is, "I save about twenty drafts -- that's ten meg of disc space -- and the last one contains all the final alterations. Once it has been printed out and received by the publishers, there's a cry here of 'Tough shit, literary researchers of the future, try getting a proper job!' and the rest are wiped."

So if stuff like that is lost, what else is?

(Anonymous) 2004-05-29 11:59 am (UTC)
I lament that the rise of the IM means that there is less email to study.

Carl Devore