IRLS 675

Tuesday, November 26, 2013

Unit 13

I have really enjoyed my experience in this class, as well as IRLS 672 over the summer. Although it has been very challenging, I have learned so much that I know will be helpful in my future career (as well as in my other classes). It was helpful to take a wider view this semester into what the technology learned in 672 is used for in the real world. I was not even aware of institutional repositories before the course (or at least not of them as a concept and what their purposes are). It was interesting to see how different the repository software is and how careful you need to be in selecting which one is best for a particular collection. I also found it very helpful to see the different ways that metadata can be personalized in each type of software to better fit the collection.

I think that one particularly helpful exercise has been developing our own standards for what we want in the software. It has been interesting to see how different all of our collections are and how we have different priorities, whether it is appearance or long-term preservation. The management readings, while often difficult to absorb, were useful not only for showing me how repositories are being developed by different types of institutions, but also for seeing how they present themselves and how to read "in between the lines" in the papers themselves. I definitely feel more prepared to work with these types of repositories; I am by no means an expert, but I am glad to have learned enough to be able to understand more of what I am looking at. Overall I am pleased with my experience with this course, and I think it will end up being one of the most directly useful ones I take in the program.

Tuesday, November 19, 2013

Unit 12

It is nice to know that a pre-installed VM is an available option, but I think I would prefer to build my own. I have learned a lot by being able to install and configure the VMs and repositories, and I think the more you know about these processes the greater control you have and the better you can troubleshoot when problems come up. At this point in my learning I would still need quite a bit of guidance; but then again, part of the learning process is knowing where to find the resources you need. A pre-installed VM may be a good option if you are very limited on time and either don't have the basic knowledge needed or you already feel comfortable enough that you don't need to go through the process in order to be familiar with what is going on.

Tuesday, November 12, 2013

Unit 11

This week we were asked to look at the home sites of the repositories we have experimented with so far this semester. I found that my favorite sites are attractive and clean-looking, with easy-to-find basic information and forums. Omeka's site meets these qualifications: the community forums are easy to find and are organized into different topics such as "Troubleshooting" and "Plugins," making the help you need easier to find. It emphasizes the community aspect further with its "Get Involved" page, to help you improve and contribute to Omeka. Information on installing and using it was easy to find.

Drupal also emphasizes community, with various ways of connecting to other users such as forums, mailing lists, and chat. It also has a "Getting Involved" page with information on how to build modules, design themes, etc. The site for PKP Open Archives Harvester is also attractive and easy to use, with forums, a wiki, and information on education and training which is easy to find.

DSpace has a nice, intuitive site with good basic training materials. However, community does not seem to be as strongly emphasized as the previous three sites, with mailing lists as the only apparent option to communicate with other users. EPrints is similar in its use of a mailing-list-only community, but is a useful site if a little more sparse than the others. I like that they have a page for training and tutorial materials, as well as a FAQ page. Jhove's site was the least useful for my purposes: it is not very user-friendly for a beginner and uses more technical language. There was no community support that I could find.

I think that a repository's home site is very important in my selection criteria: I feel more confident using a repository which seems friendly to new users, is community-oriented, and has clear training materials. After my evaluation, I felt that the Omeka, Drupal, and PKP Open Archives Harvester sites were particularly strong on all of these points.

Tuesday, November 5, 2013

Unit 10

Looking at the service providers at http://www.openarchives.org/service/listproviders.html, I first looked at digitAlexandria (http://digitalexandria.com/). "The digitAlexandria is a cross-platform system, composed by a set of very handy and fast tools, designed for building digital archives of any complexity, from the personal archive of a single researcher up to the repository of a big institution, such as a University or Research Centrer. It is based on a peer to peer network and is compatible with the Open Archives Protocol." Its goal is to offer simple to use tools for scientific researchers and institutions to build their own archives and collaborate with other researchers. They have about 1,000,000 scientific documents available for this purpose. They provide a long list of archives harvested by their system, along with their base URLs. I did not see any way to search or browse the collection. It does seem to provide a useful service in collecting these scientific resources together and making it easy for others to archive them as needed.

The next one I looked at was Perseus (http://www.perseus.tufts.edu/hopper/search?redirect=true). According to the Open Archives site, "The Perseus system harvests registered OAI repositories and incorporates the information into its search interface." The subject matter covered is Classics (history, literature and culture of the Greco-Roman world). There is a convenient browsing capability broken up by different subject matter, and the search function brought up relevant results, without an overwhelming number of records. This seems like a great resource for primary source research in the humanities.

I also looked at BASE: Bielefeld Academic Search Engine. According to http://gita.grainger.uiuc.edu/registry/services/, "BASE is the multi-disciplinary search engine to scholarly internet resources at Bielefeld University. BASE complements the current metasearch system for catalogues and databases of the Bielefeld Digital Library by disclosing multiple scholarly full text archives, digital repositories and preprint servers on the World Wide Web." The site has a very clean and simple layout, with options for browsing and basic and advanced searches. The search seemed fairly standard, and I thought it was interesting that you could browse either by DDC classification category or by document type.

Overall, I think that a federated catalog is most useful when it focuses on a certain disciplinary field or subject matter, and when it has good browsing capabilities to allow you to see what types of records it holds, as well as easily accessible information on which collections it harvests from. I think that having a huge number of records is good for retrieving a lot of varied information on a topic, but the catalog could become unwieldy and less useful if there is too much information for the user to sift through.
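For anyone curious what "harvesting" looks like underneath these service providers, here is a minimal sketch in Python of the OAI-PMH side of it. It only builds a ListRecords request URL and pulls Dublin Core titles out of a response. The base URL, the identifiers, and the sample response are invented for illustration, and nothing here is taken from how digitAlexandria, Perseus, or BASE actually implement their harvesters.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by Dublin Core elements inside an oai_dc record.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Hypothetical response, trimmed to the parts the sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample image one</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample image two</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai")
titles = extract_titles(SAMPLE_RESPONSE)
```

A real harvester would fetch that URL from each registered repository, follow resumption tokens for long result sets, and index the records it gets back. The sketch parses a canned response so that it runs without a network connection.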

Monday, October 28, 2013

Unit 9

This week we entered our collection into EPrints. Across the three repositories we have entered our collections into so far, I have found that my metadata differs slightly depending on the set-up of the repository and its ease of use. I found that Drupal allowed me to customize my metadata to my collection the best. Due to the academic focus of DSpace and EPrints, the metadata was less detailed and slightly different. For example, I chose the option for my image to be "Submitted" in EPrints, whereas this was not an element in the other repositories.

I can definitely see the challenges and expenses that could be involved on a larger scale. A larger organization would want to either create or adopt a single controlled vocabulary (whereas mine has varied depending upon the ease with which I can personalize the taxonomy). I refer to the application profile which I created at the beginning of the semester in order to keep things as consistent as possible (for example, I wrote a copyright statement which I use when that is an option). However, if I were focusing on a collection in one repository, I would need to spend more time on consistent metadata, as well as documenting it thoroughly for anyone else who is entering data.
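One way to picture what documenting an application profile buys you is to write it down as data and check records against it. This is only a rough sketch in Python: the field names, the required flags, and the subject vocabulary are all hypothetical, and they are not the actual profile from this class project or anything built into Drupal, DSpace, or EPrints.

```python
# Hypothetical application profile: every field name, flag, and
# vocabulary term below is invented for illustration.
PROFILE = {
    "title": {"required": True},
    "subject": {
        "required": True,
        "vocabulary": {"Painting", "Sculpture", "Photography"},
    },
    "rights": {"required": True},
    "description": {"required": False},
}


def check_record(record, profile=PROFILE):
    """Return a list of problems found in one metadata record."""
    problems = []
    for field, rule in profile.items():
        value = record.get(field)
        if not value:
            if rule.get("required"):
                problems.append(f"missing required field: {field}")
            continue
        vocab = rule.get("vocabulary")
        if vocab and value not in vocab:
            problems.append(
                f"{field}: '{value}' is not in the controlled vocabulary"
            )
    # Fields the profile does not know about are flagged as well.
    for field in record:
        if field not in profile:
            problems.append(f"unexpected field: {field}")
    return problems


good = {"title": "Water Lilies", "subject": "Painting", "rights": "Copyright statement here"}
bad = {"title": "Water Lilies", "subject": "Paintings"}

good_problems = check_record(good)
bad_problems = check_record(bad)
```

Here the second record would be flagged twice: "Paintings" is not a term in the vocabulary, and the rights statement is missing. With several people entering data, a check like this is one way to catch drift away from the agreed terms.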

Wednesday, October 23, 2013

Unit 8

The EPrints installation went pretty well for me: the only hiccup I ran into was getting an error message on my password when trying to use the sudo command. However, I looked at the Tech Discussion board and realized that the problem was that I was logged in as eprints, which isn't part of the sudo group. I found DSpace to be the most difficult installation and the one I had the most technical issues with, so EPrints was a little bit easier. So far I like the layout and the fact that it includes LOC subject headings (even though I need to make my own subject headings anyway). I am still not entirely clear on how adding your own taxonomy works, so I expect to spend a lot of time on that this next week. I think it's interesting how Drupal, DSpace and EPrints all have their own methods of dealing with taxonomies.

I did a little bit of branding: in addition to changing the welcome message, I added one of my images as a logo. The first method I tried did not make any change, so I tried the second method and it worked. It was pretty straightforward with the instructions handy, but I might have difficulty figuring it out on my own. For these types of changes I think I prefer the GUI environment such as Drupal has.

Tuesday, October 15, 2013

Unit 7

This week we spent a little bit more time with DSpace, inputting items into our collection. Unfortunately I am having some technical issues: I keep getting an error message when I go to actually submit an item. However, I have posted in the Tech forum for the class so hopefully I will be able to resolve it soon. Besides that, I feel that I am understanding the submission process and workflows reasonably well.

Since we were able to choose our own topic for a blog post this week, I decided to look a little bit more into the types of metadata that are ideal for images. I found an interesting report on image metadata for the FILTER (Focusing Images for Learning and Teaching - an Enriched Resource) project: http://www.ukoln.ac.uk/metadata/filter/report/report.html#1.4. This report provides a list by Howard Besser of the University of Michigan of areas where metadata standards should be developed:

-The technical information required to view the image (such as image type, file formats)
-Information about the image capture process (information about the type of image digitized, information about the scanner used, etc.)
-Information about the quality and veracity of an image (for example, whether it is a high-quality image done by a museum or an image digitized by an individual)
-Information about the original object (nature and origin, legacy content metadata, etc.)
-Information about an image's authenticity (cryptographic techniques or digital signatures)
-Information about rights management (viewing/reproduction restrictions, contact info, etc.)

It was very interesting reading these over and gaining more insight into what might be beneficial to include in the fields for a digital image repository. I have included many of these elements in my test collection, but there is some information I don't have, such as details about the scanner I used and information on the images' authenticity (which I think is beyond me for this class project). I am an art enthusiast and would love to work in a museum or similar setting, so I am enjoying learning more about the ways that image repositories in particular are set up and used.
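To make the six areas from the list a bit more concrete, here is a small Python sketch that treats each one as a field on a record and reports which ones are still empty. The class name, field names, and sample values are my own assumptions for illustration; they do not come from the FILTER report or from any repository's schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ImageMetadata:
    """One field per area in the list above (names are hypothetical)."""
    technical: Optional[str] = None        # image type, file format
    capture_process: Optional[str] = None  # scanner, source digitized
    quality: Optional[str] = None          # who digitized it, how well
    original_object: Optional[str] = None  # nature and origin
    authenticity: Optional[str] = None     # digital signature, checksum
    rights: Optional[str] = None           # restrictions, contact info


def missing_categories(record):
    """Return the names of the areas that have not been filled in."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Invented sample record with two areas left blank.
sample = ImageMetadata(
    technical="JPEG, 1200x800",
    quality="Digitized by an individual",
    original_object="Oil painting",
    rights="Copyright statement here",
)

gaps = missing_categories(sample)
```

For this sample, the gaps are the capture process and authenticity areas, the same two kinds of information mentioned above as missing from the test collection.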