Tuesday, September 28, 2010

Week 5 Reading Notes

Note: As I've already written this, I'll be changing the post's date to next week's so it can be found with the relevant discussion.

Database Wiki Article

I never had much experience with databases outside of using them for schoolwork, so I’m grateful for the chance to see how they work, what sorts are available, etc. The article covered a variety of details, but the sections that seemed most relevant to me covered the various kinds of databases, storage structures, and indexing. Oddly enough, I never did think of the internet as a database per se, primarily due to the problems of sorting and indexing the data, but according to this article, it can be considered as such. I wonder if there is going to be a better way of indexing such a database, especially since there are so many other databases within this database.

One thought that crosses my mind while reading this: what sort of approach will libraries take when it comes to sorting and organizing digital libraries? Clearly libraries will become databases as the number of digital resources increases, so I can’t help but wonder what will happen here.


Introduction to Metadata

Ah, yes, an article that speaks about “data about data.” When you really consider it, most of what we (and by “we,” I mean everyone who is searching for something, and not just librarians) work with is metadata. Our searches for data consist of metadata, looking for other similar metadata, in order to find the data we seek. We essentially work with tags associated with what it is we seek, and on words and phrases written about this information. The metadata I personally work with the most is something the article already referenced: the Library of Congress Subject Headings.

I do like how the article brings the subject of metadata outside of the libraries and into other fields. Archiving information, including in museums, relies heavily on metadata, and now, with the increased access and new approaches provided by the internet, metadata becomes even more important.

One note stood out to me as I read this article: “there is no single metadata standard that is adequate for describing all types of collections and materials.” This leaves me wondering if it is too lofty a goal to find a way to categorize and sort data as a whole, especially since the requirements for, and opinions about, metadata vary depending on the topic at hand.


An Overview of the Dublin Core Data Model

I think this is the first article I’ve run into for which I didn’t have at least some semblance of background knowledge. That being said...
At the beginning of this article, I started to believe that I spoke too soon regarding my thoughts on metadata standards across disciplines. The Dublin Core Metadata Initiative is essentially an attempt to break down the barriers between disciplines when it comes to metadata.

The article proceeds to explain the requirements of the project, which boil down to working on an international scale, identifying various sorts of information, refining data from broad terms, and remaining modular enough to be used all around.
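
A minimal Python sketch may make two of these requirements more concrete: refining broad terms and staying modular. The element names are the fifteen from the Dublin Core Metadata Element Set. Everything else is an assumption made for illustration: the dotted key style such as "date.created", the function name dumb_down, the sample record, and the "course.week" element are all hypothetical and are not taken from the article.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(record):
    """Collapse refined keys such as 'date.created' to their broad element.

    Keys whose base is not a Dublin Core element are skipped, which is how
    a reader that only knows Dublin Core would treat elements from another
    vocabulary.
    """
    simple = {}
    for key, values in record.items():
        base = key.split(".", 1)[0]  # 'date.created' -> 'date'
        if base not in DC_ELEMENTS:
            continue  # modularity: ignore what we do not understand
        simple.setdefault(base, []).extend(values)
    return simple


# Hypothetical record: values are lists because elements may repeat.
record = {
    "title": ["Week 5 Reading Notes"],
    "date.created": ["2010-09-28"],         # refinement of 'date'
    "description.abstract": ["Notes on databases and metadata."],
    "language": ["en"],
    "course.week": ["5"],                   # not a Dublin Core element
}

simple_record = dumb_down(record)
print(simple_record)
```

Here the refined keys fall back to their broad elements ("date.created" becomes "date"), and the element from outside Dublin Core is dropped rather than causing an error. A reader that understands only the fifteen broad elements can still use the record.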

There is one flaw in this approach, which is why I consider this a lofty goal: the designers of this program will face the same problem that has plagued librarians and other researchers over the years, namely how to classify data. With that in mind, is this approach the correct approach, or is this another dead end in the system?

7 comments:

  1. The quote from the second article about metadata standards struck me as well. There have been attempts to standardize the models, but resources vary so widely that it's hard. The Dublin Core Model is the closest to a standard that we have.

  2. It seems to me that if the Internet is to be considered an effective database, it has to allow for more effective retrieval. Everyone uses the internet as the ultimate database for a variety of information these days, but it's still hard sometimes to narrow down the results to what is wanted. That said, when comparing something like Google to the average OPAC, I have to admit that OPACs - which are specifically designed with retrieval in mind - often fail when it comes to some very basic retrieval functions.

  3. The quote from the metadata article struck a chord with me as well. It brought to mind the limitations that simply exist in language. The subtle differences between definitions and connotations make it difficult to account for the ways people may label things. I believe it's important to work toward a more universal categorizing system, but more realistic goals must be set.

  4. I like your idea of a "lofty" metadata scheme that tries to cover everything. Maybe tagging is going to take on that role, but it would be interesting to see what could be created by professionals.

  5. @Kel

    I do think that the internet is the "ultimate database," but I do agree with you that in order for it to be used for academic/scholarly type research, a more effective way of indexing the internet needs to be created. Google does a REALLY great job of doing this already, but the sheer number of sites makes it difficult if not impossible to index everything.

    I also agree with you that Google will often do a better job of finding something than an OPAC (PittCat in particular). Something needs to be done in OPAC land to make searching easier and more effective.

  6. Though Google is a great way to find information, I am constantly amazed at how much irrelevant stuff can surface even with advanced search techniques. I agree Google has to modify its page ranking algorithm and use more detailed indexing to alleviate some of the problems. I have found Google Scholar to be better for research.

  7. Some interesting points brought up in comments about Google/OPAC (PittCat)/Google Scholar. Seems to me that, from the readings, the idea for Dublin Core is to standardize metadata for professional use so that when searches are done for info the results found on OPAC and Google Scholar are the same because of the standardization of the metadata. I don't think the average Google user could ever be bothered about learning and using standards that might help them find info quicker.
