Friday 18 December 2009

Virtually JODConverter II

Continued from Part One

Customization


All the hard work has been done by good people in the open source community, so all we do is plug some bits together. Simple!


The bits:


1) JODConverter as a Web app.

2) OpenOffice3 as Debian packages. (We *could* use apt-get but Turnkey is based on Ubuntu 8.04LTS and as far as I can tell doesn't include OO3).

3) An OpenOffice startup script. (There are others out there too, or make your own using /etc/init.d/skeleton).


Having downloaded these files, you need to get them to JOD1. You could attach JOD1 to the world (via NAT on NIC3 or something) and download them directly, but I preferred to keep JOD1 as clean as possible - either that or I like making work for myself - and so I downloaded the files onto MON1 and SCP'd them over to JOD1.


With the bits to hand, the steps are (all on JOD1 & as root):


1) Install the JODConverter WAR in JOD1's webapps directory:

- Unpack JODConverter and find the ".war" file. If you want, rename it to something else - this will be the path in the URL to JODConverter so I kept it simple and called it jodconv2.war.

- Move jodconv2.war to /var/lib/tomcat5.5/webapps/
(You could also "deploy" the WAR file via the Tomcat Web Admin interface).

Now, by default Turnkey runs Tomcat (wisely) with a SecurityManager running. This limits the things servlets can do. If you restart Tomcat now (/etc/init.d/tomcat5.5 restart) and visit /jodconv2/ you'll probably find it isn't running. This puzzled me for a while, but turns out the SecurityManager is to blame.
I tried to grant several (individual) permissions to JODConverter but to no avail so in the end gave it blanket rights by adding:

grant codeBase "file:/var/lib/tomcat5.5/webapps/jodconv2/WEB-INF/-"
{
    permission java.security.AllPermission;
};


to the file:


/etc/tomcat5.5/policy.d/04webapps.policy
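
If you'd rather not edit the policy file by hand, the grant above can be appended from the shell. This is only a minimal sketch: the function name add_jod_grant is my own invention, and it assumes you kept the jodconv2 name for the WAR. It checks for the codeBase first so running it twice doesn't add the grant twice.

```shell
#!/bin/sh
# Append the blanket grant for the jodconv2 webapp to a Tomcat policy file.
# add_jod_grant is a hypothetical helper name, not part of Tomcat or Turnkey.
add_jod_grant() {
    policy_file="$1"
    codebase='file:/var/lib/tomcat5.5/webapps/jodconv2/WEB-INF/-'

    # Do nothing if a grant for this codeBase is already in the file.
    if grep -qF "$codebase" "$policy_file" 2>/dev/null; then
        return 0
    fi

    # Append the same grant block shown above.
    cat >> "$policy_file" <<EOF

grant codeBase "$codebase"
{
    permission java.security.AllPermission;
};
EOF
}
```

Call it as root with the policy file as the only argument (add_jod_grant /etc/tomcat5.5/policy.d/04webapps.policy) and then restart Tomcat with /etc/init.d/tomcat5.5 restart.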


Restart Tomcat and with a bit of luck you'll get the jodconv2 page - a clean form that suggests you upload a document.

Try it and it fails! Why? Because you need to do step 2!


2) Install OpenOffice 3
- Unpack the OpenOffice tar.gz. Inside there is a directory /DEBS/. I'm sure they're not all needed, but rather than work out which ones to keep and which not, I installed them all (might want to revisit that one day).
- Install the contents of DEBS:

cd [OO_DIR]/DEBS/
ls *.deb | xargs dpkg -i

3) Install the init.d script - there are instructions at the link above, but in short:
- Cut and paste the script into a new file: /etc/init.d/soffice
- Edit the script to point to the right place for OO:

OOo_HOME=/usr/bin

to


OOo_HOME=/opt/openoffice.org3/program/
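
The OOo_HOME edit above can also be done without opening an editor. A rough sketch, assuming GNU sed (which is what Ubuntu ships) and assuming the script sets OOo_HOME at the start of a line as shown; set_ooo_home is just a name I made up for illustration.

```shell
#!/bin/sh
# Point the OOo_HOME line of an init script at the OpenOffice 3 install
# and make the script executable. set_ooo_home is a hypothetical helper.
set_ooo_home() {
    script="$1"
    # Second argument is optional; default is the path used in this post.
    new_home="${2:-/opt/openoffice.org3/program/}"

    # Rewrite only the OOo_HOME assignment; every other line is left alone.
    sed -i "s|^OOo_HOME=.*|OOo_HOME=${new_home}|" "$script"

    # init.d scripts must be executable to be started at boot.
    chmod +x "$script"
}
```

Run it as set_ooo_home /etc/init.d/soffice. To have the service start at boot you will also want update-rc.d soffice defaults, which is the same one-liner that ends up in the patch's conf file later in this post.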


And, in theory, you're done!


Trying it out!
Shut down and restart JOD1 and, using MON1, connect to:


http://yourIPofJOD1/jodconv2/


Upload a sample document and hopefully you'll get a nice fancy PDF back.

If you don't, I suspect it is because I missed some vital step along the way! Sorry about that!


Feel free to comment or email if you need a hand! :-)


Packaging the appliance
So we're done right?
Well, mostly. But what if we now want to deploy this appliance? Don't we need it neatly wrapped up and ready to roll?

Yep. I guess we do!

Since we didn't do anything "static" to JOD1 (set the IP, for example), it is fairly simple to export it as OVF. It can then be imported into any virtualization system that supports OVF and should run just fine, assuming it is connected to a real or virtual network with a DHCP server.

You may also want to create an ISO of JOD1 so it can be deployed by simple installation. This is made pretty easy with TKLPatch - a set of scripts that automate the process of creating an ISO from an OS patch.

The patch I created looks like this:

jodconv2/debs/*openoffice*.deb - i.e. all the OpenOffice debs

jodconv2/overlay/ contains the following:


|-- overlay
|   |-- etc
|   |   |-- init.d
|   |   |   `-- soffice
|   |   `-- tomcat5.5
|   |       `-- policy.d
|   |           `-- 04webapps.policy
|   `-- var
|       `-- lib
|           `-- tomcat5.5
|               `-- webapps
|                   `-- jodconv2.war

and finally


jodconv2/conf is a simple one liner:

update-rc.d soffice defaults
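
For what it's worth, the empty skeleton of the patch described above can be created in one go. This is a sketch rather than anything from the TKLPatch guide: make_patch_skeleton is a made-up name, and the #!/bin/sh line at the top of conf is my assumption (conf is a script that TKLPatch runs when the patch is applied, so it needs to be executable).

```shell
#!/bin/sh
# Create the empty directory layout for the jodconv2 patch plus its conf
# script. make_patch_skeleton is a hypothetical helper name.
make_patch_skeleton() {
    patch_dir="${1:-jodconv2}"

    # debs/ holds the OpenOffice packages; overlay/ mirrors the target
    # filesystem from / downwards.
    mkdir -p "$patch_dir/debs" \
             "$patch_dir/overlay/etc/init.d" \
             "$patch_dir/overlay/etc/tomcat5.5/policy.d" \
             "$patch_dir/overlay/var/lib/tomcat5.5/webapps"

    # The one-liner from the post; the shebang line is an assumption.
    printf '#!/bin/sh\nupdate-rc.d soffice defaults\n' > "$patch_dir/conf"
    chmod +x "$patch_dir/conf"
}
```

After running make_patch_skeleton jodconv2, copy the OpenOffice debs into jodconv2/debs/ and the soffice script, 04webapps.policy and jodconv2.war into their places under jodconv2/overlay/.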


Armed with those details and the TKLPatch guide you should have no worries making an ISO of a JODConverter appliance. However, there are a few caveats with TKLPatch.


Firstly, you might notice that Turnkey (and I guess Ubuntu) spread Tomcat all over the OS - in /etc/, in /var/, etc. If you make a change (say to the policy files) and want to put that in the overlay, be sure you put the path to the *real* file rather than the symbolic path (i.e. /etc/tomcat5.5 rather than /var/lib/tomcat5.5/conf).
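
If you're not sure which path is the real one, readlink -f will resolve the symlinks for you. A small sketch, assuming GNU coreutils; overlay_target is a name I've invented, and all it does is print where a given file should live inside the overlay.

```shell
#!/bin/sh
# Print the location inside an overlay directory that corresponds to the
# *real* (symlink-free) path of a file. overlay_target is a hypothetical
# helper name.
overlay_target() {
    overlay_root="$1"
    path="$2"

    # readlink -f follows every symlink in the path and prints the result.
    real=$(readlink -f "$path") || return 1

    printf '%s%s\n' "$overlay_root" "$real"
}
```

On a machine laid out as described above, overlay_target jodconv2/overlay /var/lib/tomcat5.5/conf/policy.d/04webapps.policy should print a path under jodconv2/overlay/etc/tomcat5.5/ rather than under jodconv2/overlay/var/.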

Secondly, build the patch and create the ISO on a Turnkey Linux machine - I used JOD1 in the end. (This is mentioned in the TKL support forum).

Well, that is one long post! Sorry about that! I hope someone will find it useful one day. I suspect I will in the New Year when I've forgotten just where this jodconv2 VM came from! :-)

Thursday 17 December 2009

Virtually JODConverter I

Virtual appliances are all the rage these days. No, I'm not talking about cookers that never turn up or washing machines in SecondLife (with thanks to Pete J & apologies for the content of SecondLife!), but rather small, self-contained, often single function, virtual servers. There are a load of them made available by Turnkey Linux, who take the Long Term Support edition of Ubuntu Server (Hardy), bolt on some extra software - the Apache Web Server for example - and ship the whole thing in a ready to run package, adding a rather natty Web-based admin client on the way.

Why bother? Well, we've committed to a virtual architecture and one of the things we gain is the ability to add and remove appliances as the need arises - meeting the changing needs of the Digital Asset Management System at any point - and so having a few appliances we can throw up at the drop of a hat (someone phones and says "I need to do a huge deposit of items and I need to do it yesterday, can you handle the extra load?") will be very useful. (There are other gains too - mostly the consolidation of space and energy use - you'll find lots on all that out there in Web land!)

That said, you might just want to run JODConverter on your desktop machine. If you do, this'll help too. Just make the virtual appliance, run it on your desktop and use NAT & port mapping to connect to it. Voila! Your own personal copy of JODConverter as a Web service! :-)


Back in 2008, on the Google Code home of JODConverter, some folks seem to have suggested that a virtual appliance with JODConverter & OpenOffice would be a Good Thing(tm) :-). About a week ago, quite independently, we also decided it would be a Good Thing(tm), so I set about making one and that, in spite of this preamble, is what I really wanted to write about! :-)


So, here are the simple steps I took to make a JODConverter Virtual Appliance. Note that I used JODConverter 2.2.2, which seems more stable than 3 at the moment.

Preliminaries


1) Get yourself some virtualization software - I use VirtualBox.


2) (Optional, but I'll assume you did) Create, or reuse, a regular desktop VM (I used a standard Xubuntu install) - MON1 - and attach it to an internal (virtual) network on NIC0 (it is also useful to attach another NIC to the world via NAT). Also give this VM a shared folder with the host (your desktop PC). This will be handy later for moving ISOs, patches, etc. into and out of the virtual world.


3) Create a new (small) VM and install a Turnkey appliance of choice - I'll call this JOD1.
I used Turnkey's Tomcat but you might be trying to do something different. :-) I opted for a small and simple configuration (512MB RAM, 1 x 2GB disk and nothing fancy). Remember that appliances don't have fancy "desktops" so graphics capability isn't really a requirement! :-)

4) (Optional, see 2) Attach JOD1 (the VM created in step 3) to the same internal network as MON1.
We do this so that you can check open ports on JOD1, test if JODConverter is running, OpenOffice service is up, etc.


5) Start both machines.

You should now have a running Tomcat VM & a method of seeing it - open a browser in the test machine (MON1) and try JOD1's IP (port 80). You should see the Web admin interface. If you don't, check all the network connections, that JOD1 started OK, and such.


Now would probably be a good time to change the Web admin password!


Continued

Antagonistic Books

I love Instructables, and since it is nearly Christmas I thought I'd mention it. Not just because it's great, but also because these appeared in the last few weeks:

http://www.instructables.com/id/ANTAGONISTIC-BOOKS-Danger-How-To-make-a-book-th/

and

http://www.instructables.com/id/ANTAGONISTIC-BOOKS-Curiosity-How-To-make-a-book/

Just something else to watch out for in archives I guess! (I can't wait until the second one turns up only someone turned the ratchet around so it'll only close...)

There is something about them both that reminds me of William Gibson's Agrippa (a book of the dead) and that is something that has puzzled some of the archivists I've spoken with! :-)

Wednesday 9 December 2009

Open Development: Building an Engaged Community

I had an interesting day on Monday at the OSS Watch workshop Open Development: Building an Engaged Community (the slides are available from there).

The day's aims were:
  • Understand how open development works and know the common community structures
  • Be familiar with the skills and processes that encourage community participation
  • Develop ideas for improving the community friendliness of a specific project
I've been using open source software all my career but I've always been 1) a bit sketchy about how and 2) very nervous of getting involved in any open source software development so this sounded perfect!

From the outset Steve Lee of OSS Watch, in his introduction to the day, made it clear that open development practice, rather than simply access to the source code, was key to open source software. (A couple of the presenters said "don't just throw it over the wall", referring to the practice of putting your source code somewhere public and walking away - a very common practice in our field - as this would not lead to a sustainable software product).

The rest of the day supported this ideal... Sebastian Brännström of the Symbian Foundation spoke of how Symbian hoped to make as much of the Symbian operating system (for phones) open source as soon as possible, and outlined the large (and quite formal) organisational structure required to support the 40 million lines of code. For a software project that large, this shouldn't come as a surprise, but clearly shows that "open sourcing" (I mean, the process to make software open source rather than sourcing work in an open way - though both are valid!) might not always be cheap or a free (beer) option. Indeed, he hoped that there would be a full-time, paid, community leader whose sole role would be to maintain and manage one of the 134 software packages that make up the Symbian OS.

Next up, Sander van der Waal of OSS Watch took us through the developer experience of taking part in an open source project - both from being part of a commercial company in the Netherlands and also working on the OSS Watch project SIMAL. It was very interesting to hear how his team had gone about contributing to Apache Felix & Jackrabbit (Two products very much of interest to our community!). He suggested it was very important to make use of the usual cluster of open source development tools - not just version management, but also mailing lists, bug tracking systems, wikis and the like - and that this was important if you were a "one man band" developer or a whole team. In many ways his experience here helped ease my nerves of contributing to projects.

The final speaker was Mark Johnson, of Taunton's College, giving his experiences and tips on being involved with the open source course management system, Moodle. In a past life I've developed for Moodle, so this was interesting to hear about. His advice was broadly similar to that of the other two speakers, though from a different perspective and here there was evidence of useful reinforcement of ideas rather than repetition, which is always a good thing.

A workshop isn't complete without a bit of group work and we were asked to complete a questionnaire designed, I think, to get us thinking about the sustainability of our open source projects by highlighting areas we should be considering - licensing, use of standards, documentation, etc.

This was a very useful tool and the questions got me thinking about all sorts of things. The results for futureArch were bad - all "red" (for danger) except the section on use of standards - but that didn't come as much of a surprise. I think it would be fair to say that futureArch isn't an "Open Source Project" per se. Rather we're avid users of open source software. We, like many, do not have the resources to run a community around anything we build (who has funding for a full-time community manager?) and it would probably be inappropriate to try. But we can and will contribute to other projects and the workshop helped me see that this was both pretty easy (assuming everyone is nice) and desirable.

And, of course, anything we build here - the ingest tool for example or the metadata manager - will probably be "thrown over the wall" and people will be able to find it and others, if they get the urge, will be able to found a community, which I guess shows there is value in simple publication of source code in addition to the (far more preferable and more likely to succeed) development of a community around a product. (The revelation that community building is essential for a sustained software product, probably so obvious to many, sheds light on the reasons behind things like Dev8D too).

Just some final thoughts then, as it grows ever darker and it is good not to cycle home too late!

Firstly, it struck me as people talked, that while open source could be seen as less formal than closed software development, it clearly is not. Development of communities and the subsequent control and management of those communities, requires formal structures making open source anything but an easy option.

Secondly, the reasons given for contributing to an open source project were fascinating. Someone mentioned how by taking part you felt you were not alone, but the overwhelming reason given was "recognition". By contributing you could get your name (and that of your employer) in lights, and participating in a community could lead to job offers or other personal success. As most projects run on a meritocratic basis - the more good you do, the more say you have - that success could be to become the community leader or at least one of the controllers of the code - the fabled "committers". This is a curious thing - the reason to participate in a "community" is the "selfish" urge to self-promote. Something jars there, but I'm not quite sure what.