Rachel's Blog: November 2008

Friday, November 28, 2008

week 12 muddiest point

My muddiest point is in regard to social software. According to Dr. He's PowerPoint, libraries will have to become 2.0 as software makes this shift. However, it seems to me that software is behind libraries, not ahead of them. We are already pushing social-based activities and communication, so where will this shift to Web 2.0 really make a difference?

Sunday, November 16, 2008

My Muddiest Point--Week 11

My muddiest point for week 11 is in regard to Dr. He's lecture.
It's mentioned that vocabulary control via a thesaurus can use either an equivalence relationship or a hierarchical relationship. Which one of these methods is more effective and why? My assumption would be that the hierarchical relationship would be more beneficial, but I'd like to know the right answer. :)
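
To make the two relationships concrete, here is a minimal sketch in Python. The terms and the function names (preferred, expand, search_terms) are made up for illustration and are not from the lecture. It suggests the two relationships do different jobs: equivalence maps a searcher's word to one preferred term, while hierarchy widens a search to narrower terms.

```python
# Hypothetical mini-thesaurus; all terms below are made-up illustrations.
# Equivalence relationship: variant terms point to one preferred term.
USE = {
    "cars": "automobiles",
    "autos": "automobiles",
    "motorcars": "automobiles",
}

# Hierarchical relationship: broader term -> narrower terms.
NARROWER = {
    "vehicles": ["automobiles", "bicycles"],
    "automobiles": ["sedans", "trucks"],
}


def preferred(term):
    """Map a searcher's word to the preferred term (equivalence)."""
    term = term.lower()
    return USE.get(term, term)


def expand(term):
    """Return the term plus every narrower term beneath it (hierarchy)."""
    found = [term]
    for child in NARROWER.get(term, []):
        found.extend(expand(child))
    return found


def search_terms(query):
    """Use both relationships: normalize first, then expand downward."""
    return expand(preferred(query))
```

Under these assumptions, search_terms("autos") first resolves to "automobiles" and then picks up "sedans" and "trucks", so the two relationships end up working together rather than competing.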

comments for week 11 readings :)

Here are the links to my comments for Week 11:

https://www.blogger.com/comment.g?blogID=5162573700267662965&postID=6747130404438335672&page=1

https://www.blogger.com/comment.g?blogID=2367464305070960355&postID=7998496854095718979

Friday, November 14, 2008

Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age

--2002 development of institutional repositories
--broader range of abilities with this new function (especially in academic settings)
--a stewardship of digital materials
--'collaboration among librarians, information technologists, and records managers, faculty, and university administrators and policymakers'
I thought this article was extremely informative and really spoke to the worth of collaboration. I was pleasantly surprised that, with so many groups working together, institutional repositories have been so successful (that may seem somewhat pessimistic, but it's true!).
Either way, I feel that this is definitely something that can be effective, not just now but in the future as well.

Dewey Meets Turing

digital + library = librarians, computer scientists, and publishers
--DLI: Digital Libraries Initiative
--considered a "matchmaking" of computer scientists and librarians
--was it successful?
--In 1994 the WWW threw crazy stuff into the picture, which really blurred the lines between computer scientists and librarians (and really undermined the common ground that brought the two groups together in the first place)
--bigger web equalled more heuristic approaches to organizing info
--librarians wanted a clear connection to traditional librarian functions
--no matter what the technology, the CORE function of librarians is still relevant
--collections are re-emerging
--more opportunities for connections between scholarly authors/works and librarians
--simply need to come together and find what they need while still working together

digital libraries: challenges and influential work

--greatly distributed scholarly info landscape, makes search and discovery of ideas difficult and taxing
--federated search diagram
--seamless federation of resources = "the holy grail" as author states
--several federally supported projects, incl. UC Berkeley, Stanford, Michigan, etc.
--computer and networking technology changed over the last decade
--digital world rapidly evolving, this affects 1. publishers, 2. publisher consortiums, 3. bibliographic utilities, 4. academic consortia, and so on.
--several university studies focused on the issues of 'search interoperability' and 'federated searching'

--the goal is to extend services in the next few years to provide better-quality searching and more efficient access to information
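
The federated searching idea in these notes can be sketched roughly as below. This is only an illustration under stated assumptions: the three "sources" are made-up in-memory lists standing in for separate collections, and search_source and federated_search are hypothetical names, not anything from the article. A real system would send the query to remote services and would have to reconcile different metadata formats.

```python
# Hypothetical stand-ins for separate collections; a real federated
# search would send the query to remote services instead.
SOURCES = {
    "library_catalog": [
        {"id": "cat-1", "title": "Digital Libraries in Practice"},
        {"id": "cat-2", "title": "Cataloging Basics"},
    ],
    "publisher_site": [
        {"id": "pub-1", "title": "Federated Searching and Digital Libraries"},
    ],
    "repository": [
        {"id": "rep-1", "title": "Metadata Harvesting Notes"},
    ],
}


def search_source(records, query):
    """Return records whose title contains every query word."""
    words = query.lower().split()
    return [r for r in records if all(w in r["title"].lower() for w in words)]


def federated_search(query, sources=SOURCES):
    """Send one query to every source and merge the hits into one list."""
    merged = []
    for name, records in sources.items():
        for hit in search_source(records, query):
            merged.append({"source": name, **hit})
    # Sort by title so the user sees one list, not one list per source.
    return sorted(merged, key=lambda h: h["title"])
```

The point of the sketch is the shape of the problem: one query goes out, several collections answer, and the user gets a single merged list back.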

Tuesday, November 11, 2008

link to website

Here's the link to my website!

www.pitt.edu/~rdr22/

Sunday, November 9, 2008

my comments :)

Here are my comments on Week 10 readings:

https://www.blogger.com/comment.g?blogID=7533952523781723717&postID=8280044869916472597&page=1

https://www.blogger.com/comment.g?blogID=5162573700267662965&postID=7208911322336128681&page=1

Current developments and future trends for the OAI protocol for metadata harvesting.

Open Archives Initiative (OAI)
--so far has been fairly successful
--widely adopted since 2001
--its purpose is defined as: "to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content"
--NSDL provides access to science based learning objects
--problems with the registries are completeness and sparse records
--ongoing challenges:
--metadata variation
--metadata formats
--OAI Data Provider Implementation Practices
--Communication Issues

--this article was great for informing me about the OAI. I wasn't deeply familiar with it, but the article brought up many of the challenges and a lot of key points about what the OAI has done and will be doing for archives.
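
For a feel of what metadata harvesting looks like in practice, here is a small Python sketch. The ListRecords verb, the oai_dc metadata prefix, and the XML namespaces are part of OAI-PMH 2.0. The base URL, the sample response, and the function names are made up for illustration, and the sample is trimmed to just the parts the parser reads. Nothing here touches the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH 2.0 responses carrying Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for a data provider."""
    return base_url + "?" + urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    )


def titles_by_identifier(xml_text):
    """Pull (identifier, title) pairs out of a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        # A sparse record may have no title at all, hence the default.
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Made-up response, trimmed to the parts the parser reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample learning object</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"/>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

The second sample record has no title on purpose: it is a tiny version of the "sparse records" problem noted above, and the harvester has to cope with it rather than crash.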

Friday, November 7, 2008

the deep web: surfacing hidden value

--traditional search engines are not effective for content located within the deep web
--interesting stat:
fully 95% of the deep web is available to the public without cost/subscription requirements!
--search engines give "indiscriminate crawls" that do not enable access to the full breadth of pages/information out there
--surface web likened to boats on top of water, the deeper you go into the body of water, the more that's down there (the deep web)
--original deep content EXCEEDS all printed global content! whoa!
--serious information searchers must acknowledge the amount and quality of information available through the "deep web" and learn to access it

web search engines, parts 1 and 2

Part 1:
--went from the belief that webpages couldn't be indexed (1995) to very reliable search engines, such as Google, Yahoo, etc.
--generic search engine infrastructure--multiple, geographically distributed data centers
--crawling algorithms work through a queue of URLs, fetching each page and adding any new links they find, and continue until the queue is empty
--real crawlers must address: speed, politeness, excluded content, continuous crawling, spam rejection, and duplicate content
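
The basic queue-driven crawl in these notes can be sketched in a few lines of Python. Everything here is an assumption for illustration: WEB is a made-up in-memory link graph instead of real HTTP fetching, and EXCLUDED_PREFIXES is a crude stand-in for exclusion rules. Of the real-crawler concerns listed above, the sketch only touches excluded content and duplicate URLs; speed, politeness, continuous crawling, and spam rejection are left out.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> links found on that page.
WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/", "a.example/private/notes"],
    "a.example/private/notes": ["c.example/"],
    "b.example/": ["a.example/about"],
}
EXCLUDED_PREFIXES = ("a.example/private/",)  # stand-in for exclusion rules


def crawl(seeds, web=WEB, excluded=EXCLUDED_PREFIXES):
    """Visit pages breadth-first until the queue of URLs is empty."""
    queue = deque(seeds)
    seen = set(seeds)  # avoids queueing the same URL twice
    visited = []
    while queue:
        url = queue.popleft()
        if url.startswith(excluded):
            continue  # excluded content is skipped, links and all
        visited.append(url)
        for link in web.get(url, []):  # "fetch" the page and read its links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Starting from "a.example/", the crawl reaches the about page and b.example/ but never sees c.example/, because the only link to it sits on an excluded page.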

Part 2:
Indexing Algorithms:
--uses an inverted file: a two-step process of 1) scanning and 2) inversion
Issues with real indexers:
--scaling up: simply too many entries
--term lookup: search terms extend beyond the basic English dictionary to include numbers, characters, email addresses, etc.
--compression
--phrases
--anchor text
--link popularity score
--query-independent score
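
The two-step inverted file idea (scan, then invert) can be sketched like this in Python. The documents, the tokenizing rule, and the function names are assumptions made for illustration. None of the real-indexer issues above (scale, compression, phrases, anchor text, scoring) are handled, except that the tokenizer keeps an email address together as one term.

```python
from collections import defaultdict
import re

# Made-up documents for illustration.
DOCS = {
    1: "Deep web content is hidden from crawlers",
    2: "Crawlers index the surface web",
    3: "Email me at user@example.com about the index",
}


def scan(docs):
    """Step 1, scanning: emit a (term, doc_id) pair for each term seen."""
    pairs = []
    for doc_id, text in docs.items():
        # Keep @ and . inside tokens so an email address stays one term.
        for token in re.findall(r"[a-z0-9@.]+", text.lower()):
            term = token.strip(".")  # drop sentence-ending periods
            if term:
                pairs.append((term, doc_id))
    return pairs


def invert(pairs):
    """Step 2, inversion: sort by term and group into postings lists."""
    index = defaultdict(list)
    for term, doc_id in sorted(set(pairs)):
        index[term].append(doc_id)
    return dict(index)


INDEX = invert(scan(DOCS))
```

After this runs, looking up a term such as INDEX["web"] gives the list of documents that contain it, which is the whole point of inverting: going from "document -> words" to "word -> documents".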

Query processing algorithms:
--most common = queries that don't include operator words
Speeding up queries:
--skipping items
--early termination: sort the information as you search
--caching
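
Two of the speed-ups above, early termination and caching, can be sketched together in Python. The postings data and the top_results function are made up, and the sketch assumes the postings are already sorted so the best documents come first. Skipping is not shown.

```python
from functools import lru_cache

# Hypothetical postings: term -> doc ids, assumed to be sorted so the docs
# with the best query-independent score come first.
POSTINGS = {
    "deep": (1, 4, 7, 9),
    "web": (1, 2, 4, 5, 9),
    "search": (2, 4, 9),
}


@lru_cache(maxsize=128)  # caching: repeated queries skip the work below
def top_results(query, k=2):
    """Match docs containing every query term; stop once k are found."""
    terms = query.lower().split()
    lists = [POSTINGS.get(t, ()) for t in terms]
    if not lists:
        return ()
    others = [set(p) for p in lists[1:]]
    results = []
    for doc_id in lists[0]:
        if all(doc_id in other for other in others):
            results.append(doc_id)
            if len(results) == k:
                break  # early termination: later docs rank lower
    return tuple(results)
```

Early termination only works here because of the sorting assumption: once k matches are found, anything further down the list would rank lower anyway. The cache simply remembers answers, so asking "deep web" a second time does no searching at all.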