Emily's DigIn Blog: 2015

Wednesday, May 6, 2015

Unit 13: Playing with Servers and Some Thoughts on the Course

Downloading a preinstalled VM would be appropriate if the main learning objectives of the course are related to experimenting with various applications that can be installed on servers, like the various content management systems (Drupal, WordPress), collection management systems (Omeka, DSpace, EPrints, Fedora), and other tools like harvesters. However, if the course is focusing on managing digital collections, I think that the server aspects are as important as the software. I think that both this course and IRLS 672 mention that the important thing in technology courses is not learning one way to do things or memorizing a set of practices that will always work; it’s about learning how to learn technology: how things work in a general sense, what the various pieces of the LAMP server do, etc. So just learning the applications used to manage collections is not very valuable if you can’t explain how they fit in with the LAMP architecture and the various financial and managerial issues that accompany server maintenance and protection.

I suppose that preconfigured solutions would provide more time for developing the collection and the metadata elements and application profile that describe the collection. However, we already had to build and describe collections in our IRLS 515 Organization of Information course. I would argue that the experience of using the various software packages to implement some items from the collection, for the purpose of comparing the various options for hosting digital collections, is more valuable than working harder to develop the digital collection itself. I would almost prefer that each student had to describe and enter the same digital collection (or choose from five practice collection options or something), because deciding what to put in my collection was stressful for me. I understand the value of reading about what makes a good digital collection and then trying to follow those principles when developing your own collection, but it seems like most post-school projects I would engage with would involve making a collection of pre-selected materials accessible digitally.

In terms of computer skills, I had no trouble conceptually or in practice with any of the exercises we did in this course. I really appreciated the opportunity to cement my knowledge and skills with LAMP servers and navigating using the command line that I developed in IRLS 672.

I think that the course’s balance of hands-on server configuration, hands-on collection management system administration, and management thinking and writing was pretty ideal for my learning needs. The secondary management set of readings and the accompanying Botticelli lectures felt out of place to me—we already have to take a management course to get the DigIn certificate, and IRLS 671 and 674 also encompass management topics, and this course talks about management in the tech portion, so why is there also a standalone management aspect to this course? Taking out the server configuration aspects would give students more time to complete this management portion, but I think that the hands-on experience that would be lost would not make this trade worth it.

Monday, May 4, 2015

Unit 4: Drupal and my collection

Drupal is somewhat suitable for my collection. As we have learned, Drupal is very customizable. While writing out all the necessary categories and refinements to accurately describe my collection took time, I really like the dropdown controlled vocabulary menu, and once the work of creating the framework is done, entering the collection items themselves is not too time consuming.

I think that the customizability of Drupal makes it a great choice for unusual digital collections that don’t fit the traditional Dublin Core Metadata Elements or that aren’t suited to the text-heavy item records that some collection management systems use. Since Drupal is open source, anyone can develop modules to customize the system to their needs. Luckily, you don’t need to be a programmer to customize Drupal—many useful modules already exist. According to the discussion posts by my fellow classmates, there are modules for Drupal that allow you to create an image gallery, to play videos or sound clips, and for almost any other task you can imagine.

A weakness of Drupal is that since it is primarily a web development content management system, it does not have a focus on preserving the resources stored within a Drupal digital collection.

I also felt like Drupal had a pretty pronounced learning curve. Much like the case study of implementing a web CMS I looked at in Unit 2, I felt like I had plenty of information about installing Drupal, but to actually learn Drupal I would need more time to poke around and figure out how things work for my specific needs.

I have seen some Drupal sites that are very visually appealing. However, the initial build of Drupal looks pretty blog-ish and simple, even when a different theme is installed. I think that there are some collection management systems we are going to look at that produce more “digital collection” looking pages automatically, instead of all the reconfiguring within menus and plugins I would have to mess with to get Drupal to look how I want it to look for my collection.

Some criteria that I think are important for evaluating a CMS are: whether or not it is open source, what kind of support is there for users (both paid services and more informal community services on forums), is the community active, how does the resulting digital collection look to users, how easy is the CMS to use as an administrator, and is the CMS preserving the materials it stores.

Unit 2: Case Study in Choosing and Implementing a CMS

I chose to discuss Huttenlock et al.’s article, “Untangling a tangled web: a case study in choosing and implementing a CMS.” The three-person Systems and Technological Services department at Wheaton College began experimenting with more complex webpages in the early 2000s. While they developed an in-house CMS that worked for a short period, in 2003 they decided to use a commercial product.

In order to evaluate the various CMSs, the department developed a rubric. They wanted the CMS to be open source, allow data from the old website to migrate to the new CMS, secure, established (not a first version release), and easy to use from a development and management perspective (Huttenlock et al. 63). I found it pretty amusing that, “many of the products started to look the same. No matter which CMS one chose, one could easily create the same boxy site with approximately the same amount of work” (63). Maybe it’s because we are working on our digital collections more than a decade after Huttenlock et al. looked at CMSs, but I would not say that the various collection management software we have tried in this course creates the same looking sites with the same amount of work. It could also be that the curated introduction to these systems we are receiving in this class already eliminated products that are very similar to one another.

Eventually the folks at Wheaton College decided that a CMS that could use their existing database structure was the best option, so they chose one called WebGUI. Interestingly, WebGUI had not been a frontrunner during the first part of the selection process because it did not initially seem as user friendly as some of the other CMSs (64).

While there were guides about installing WebGUI, the team found that learning how to actually use the CMS (which modules work together, how to set up pages, etc.) took the most time. The team found assistance in the WebGUI community, both at the annual WebGUI conference and on the WebGUI forums. I thought that this commentary brought up a very important element of choosing a CMS: the community. A tool with an active and robust community using it and talking about it is superior to a tool that doesn’t have that informal support network.

It seems like the biggest takeaway from this experience for the team at Wheaton was that allowing enough time to choose, install, and learn a CMS is essential. They had initially budgeted one summer as enough time to complete this project, but it ended up taking longer. Huttenlock ends by saying, “Usability is more that just how easy a system is to use, it is based on the context of the actual tasks that someone will need to use it for” (68). This is an apt observation for this case study since the team initially didn’t think WebGUI would be a good CMS judging by its look, but its functionality suited their needs best.

Works Cited

Terry L. Huttenlock, Jeff W. Beaird, Ronald W. Fordham (2006), "Untangling a tangled web: a case study in choosing and implementing a CMS", Library Hi Tech, Vol. 24 Iss. 1, pp. 61-68. Permanent link to this document: http://dx.doi.org/10.1108/07378830610652112

Friday, April 17, 2015

Unit 12: One Content Management System to Rule Them All

I have been treating my experience with each new website we have tried as an audition for the final project. This means that while it is important to consider which site is the best overall (most user friendly, most attractive, easiest to customize, etc), it is even more important to consider which fits my project the best. The site that I like the most might not be the best site for hosting my collection. Clearly, all of the sites that we’ve looked at have specific uses that suit them better than others. I can see the value in each of the resources we’ve examined this semester.

For example, I liked DSpace better than Drupal. I thought DSpace made sense, it was relatively easy to navigate, and it was great for hosting documents. However, my collection is made up of more than just documents; I also have images and audio files. After using Omeka this week, I feel like it is a better fit for my collection than DSpace. But then, thinking back to the beginning of the semester, Drupal is so customizable that it can really host any type of collection, so saying that Omeka is the best site to host my collection isn’t necessarily true. I think I just like Omeka because a lot of the work (such as adding Dublin Core metadata elements) has already been done for me, which makes importing items easy. Plus, I like the look and feel of Omeka more than any other hosting site we’ve used. It feels much more clean, organized, and modern.

I’ve already talked a bit about Omeka, so I think I’ll go in reverse chronological order to discuss the other sites. So, the EPrints harvester was an interesting resource. I have seen federated collections before, but I hadn’t ever given much thought as to how the collections were brought together. As many of my classmates said on the forum, it seems like making your archive harvestable was trendy a few years ago but has largely tapered off. I didn’t have any trouble with the harvester, and I thought the resulting collection was decent: very browseable, relatively searchable, and not too ugly or outdated looking.

EPrints itself was kind of a mixed bag. I didn’t hate it but I didn’t love it. EPrints was great for the few academic journal articles housed in my collection. The whole broken subjects aspect soured the EPrints experience for me: it was such a chore to enter even one item, and the resulting collection was missing an important part of metadata. I can imagine that EPrints is pretty decent when everything is working, but the resulting site is not as aesthetically pleasing or as intuitive as Omeka’s.

DSpace was pretty middle-of-the-road for me. I can see why a lot of universities use DSpace: it’s good with metadata and preservation, both of which are essential to institutional repositories. I thought DSpace was a bit less user friendly than Drupal, but it did fit the needs of my academically focused collection with less effort on my part than Drupal. Before working with EPrints and Omeka, I thought DSpace was a contender for my final project.

I didn’t like JHOVE. At the time I didn’t really understand what it was or why we were using it. One of the PDFs I put into JHOVE generated pages and pages of nonsense… Writing this post has made me realize I need to look at JHOVE again so I can talk better about it for the final project.

Drupal is like a sandbox: everything is customizable, there are tons of cool add-ons and things to play with, but you have to figure out the menu system before you can really play. While using Drupal I often got the feeling of “I know there is a way to do this, but I don’t remember where it is in the menu,” which led to lots of searching and guide reading. I imagine that people who are comfortable in Drupal like it, and I have no doubt that Drupal can create good digital collections, but I think there are better options to host my collection.

So at this point in time, it’s looking like I’m going to choose Omeka. We’ll see if working more with Omeka next week changes my perception at all.

Friday, April 10, 2015

Unit 11: OAI Metadata Harvesting Services

This week, I looked at a few different service providers of databases/archives based on the OAI harvesting protocol. Some were good, some were bad, and it got me thinking about what makes a good and useful federated collection.
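For context on what a service provider is doing behind the scenes, here is a minimal Python sketch of the parsing half of a harvest. It assumes the harvester has already requested something like `?verb=ListRecords&metadataPrefix=oai_dc` from a repository and received unqualified Dublin Core back. The sample XML, the `example.org` identifiers, and the `parse_records` function name are all made up for illustration; this is not the code any of the sites below actually run.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response (trimmed); a real harvester would
# fetch this over HTTP from each data provider's base URL.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Waltz</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Untitled Report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per harvested record: its identifier and DC fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        fields = {}
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is not None:
            for el in dc:
                # Strip the "{namespace}" prefix so keys read "title", "creator".
                name = el.tag.split("}", 1)[-1]
                fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": identifier, "fields": fields})
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_RESPONSE):
        print(record["identifier"], record["fields"])
```

Notice that the second sample record has no creator. A federated collection only knows what each contributing repository exposes, which is one reason metadata can end up uneven across collections.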

First, I took a look at Heritage West, an archive of digital objects related to the Western United States. I was interested in this federated collection because it combines objects from university and museum libraries with objects from local small historical societies and museums. I found Heritage West using The University of Illinois OAI-PMH Service Provider Registry, but the link on that site now connects to a blog about the merits of steel shooting targets. With a quick Google search I found the real Heritage West (http://heritagewest.coalliance.org/), which is a very simple looking Omeka-based archive. Some of the links within this archive don’t appear to work: I couldn’t find anything using “Browse Categories,” but I could view individual items using “Browse Items” and “Browse Collections”. Some of the individual items had extensive metadata but no way to view the item itself within its home collection. The Advanced Search function left a lot to be desired. I could only search by keyword, collection, and one “Narrow by Specific Fields” dropdown menu where I could select DC or item type metadata and then write in what I wanted to search, which was very clunky and not intuitive (http://heritagewest.coalliance.org/items/advanced-search). Overall, Heritage West did not fulfill its mission of giving me access to a bunch of different Western US resources in one place: I couldn’t see the items themselves (only metadata), the search was clunky, metadata was not consistent across collections, and some of the site’s ways of organizing the data no longer worked. It would be simpler to actually search the federated institutions’ websites than use Heritage West’s interface.

Next, I looked at UCLA’s Sheet Music Consortium (http://digital2.library.ucla.edu/sheetmusic/) which I found on the OpenArchives.org list. I can definitely see a need for a project like this, since digital sheet music can be hard to find and hard to verify that it is legal to use and correct, etc. The records within this provider did not have a lot of metadata (mainly title, creator, identifier, and name of library holding the resource). Interestingly, many of the records housed in this repository that purports to “promote access to and use of online sheet music collections” were to resources that are not available online. However, you can choose to browse the Virtual Collection (http://digital2.library.ucla.edu/sheetmusic/virtualcollection.html) that contains all digital music. There is a social element to the Virtual Collection—you can view other users’ collections of music. You can also check a box when searching to only return digitized sheet music fitting your search parameters. In general, the digitized music had much more thorough metadata than the record-only pieces, which makes sense because more metadata was likely added when the objects were digitized. Overall, I found the Sheet Music Consortium good and useful because it allows users to search records effectively, the actual digital objects are accessible through the site, and the metadata for the digitized music was consistent and thorough enough to allow for productive searches.

Finally, I looked at the NASA Technical Reports Server (NTRS) (http://ntrs.nasa.gov/). This site was generally good. The search and advanced search options were very customizable and well-developed, plus there was a “Search Tips” section that provided information on how to search and refine searches, which helps with accessibility (http://www.sti.nasa.gov/ntrs-search-tips/#.VSiMwJTF8ww). NASA has their own metadata terms that they have added to records, so it’s good there is a document that helps to explain them. Their “Browse By” page was also good, with lots of categories to pick from, including date-based categories, document type, center that houses the information, and availability. In terms of metadata for records, I looked at some PhD dissertations that had very extensive metadata, a few computer programs that also had good metadata, and some datasets that had decent metadata. The problem I had with the dataset records was that they contained a lot of good information, but it wasn’t well categorized, which could create problems when searching. Overall, NTRS was a very thorough and well organized provider of NASA technical reports. I found it to be a useful tool because the metadata for records was extensive, the search interface was easy to use (especially the search tips!), and the browse section was well organized.

Through this exercise, I found that I really like it when you can search by availability, because I don’t usually use an online repository or database to just find out that something exists; I want to see the thing!

Huge federated collections are good in that they provide “one stop shopping” for a lot of different resources at once. They are less good when they become less cohesive in terms of subject matter, or when metadata between the different collections is not consistent, which makes searching less effective.