Installation of Python-FedoraCommons-Webarchive on CentOS

Before installing this on CentOS we installed it on Ubuntu. Please note that the Ubuntu install took considerably less time and worked perfectly. For Ubuntu install instructions see this page.

1 Setup Linux Server.

We used CentOS 4, hosted on a VMware virtual machine.

Note that the user name I will be using in this guide is simply ‘user’ and I will refer to this as either ‘user’ or ‘username’. Replace this with whatever the username was that you chose during installation.

1.1 Create default user

Get a root prompt on a command line:


[root@server]# cd /usr/sbin
[root@server]# ./useradd -m -d /home/user user
[root@server]# passwd user

1.2 Python2.5 installation

Ensure you have Python2.5 installed.

We used an older version of CentOS that did not have Python2.5 installed. If Python2.5 did not come installed with the OS you are using, install it from source. Install zlib first so that Python can be configured with it.

(Optional) Download zlib from here
user@server # tar -zxvf zlib-1.2.3.tar.gz
user@server # cd zlib-1.2.3
user@server # ./configure
user@server # make
user@server # su
root@server # make install

Download python2.5 from the following URL:

Install Python2.5

user@server # cd /home/user
user@server # cd python-2.5.2
user@server # ./configure --with-zlib=/usr/local/include/ (this is the location of the zlib.h file)
user@server # make
user@server # su
root@server # make install

2 Set up networking and firewalls

2.1 IPtables

Edit the iptables file

[root@server] # vi /etc/sysconfig/iptables

Add the following lines:

-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 8080 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 5000 -j ACCEPT
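
Once the rules are loaded and the services are running, one way to confirm that the two ports are reachable from another machine is to attempt a TCP connection. This is only a minimal sketch, assuming a current Python 3 interpreter on the client machine rather than the Python 2.5 built above; port_is_open is a helper name invented for this example and the host name is a placeholder.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered by the firewall (timeout), or unresolvable host.
        return False

if __name__ == "__main__":
    # 8080 is Tomcat (Fedora and Solr); 5000 is the web archive application.
    # "localhost" is a placeholder: use the server's name from a remote client.
    for port in (8080, 5000):
        state = "open" if port_is_open("localhost", port) else "closed or filtered"
        print("port %d: %s" % (port, state))
```

Bear in mind that a port reports as closed until something is actually listening on it, even when the firewall allows the traffic.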

2.2 Proxy

[root@server] # export HTTP_PROXY=hostname:PORT

3 Install all updates and get the basic applications we will need

To ensure that your machine is up-to-date, run the following:

[root@server]# yum update
[..... lots of lines of stuff ....]
Total download size: 12 M
Is this ok [y/N]: y

Hopefully, once those are installed, your machine will be up to date. Now install all the necessary packages:

3.1 Initial packages

[root@server]# yum install gcc
[root@server]# yum install python-devel
[root@server]# yum install mysql-server
[root@server]# yum install openssh-server

3.2 Install easy install

Download the script from the following URL:

Install the script by running it as root:

[root@server]# wget
[root@server]# python

If you receive the following error you may have to yum install zlib (see above).

 File "", line 267, in <module>
  File "", line 200, in main
    from setuptools.command.easy_install import main
zipimport.ZipImportError: can't decompress data; zlib not available
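
To find out whether your Python build has zlib support before running the easy install script, a small check like the following may help. It is a sketch only; zlib_available is a name made up for this example, and it is written so that it should run on both the Python 2.5 built above and a current Python 3.

```python
def zlib_available():
    """Return True if this interpreter was built with a working zlib module."""
    try:
        import zlib
    except ImportError:
        # Python was compiled without zlib; rebuild it after installing zlib.
        return False
    # Round-trip a small payload to confirm compression actually works.
    payload = "setuptools eggs are zip files".encode("ascii")
    return zlib.decompress(zlib.compress(payload)) == payload

print("zlib available: %s" % zlib_available())
```

If this reports False, install zlib and rebuild Python before trying the script again.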

3.3 Java

To install sun-java5-jdk, locate and download the JDK binary file via the following link:

Place file jdk-1_5_0_15-linux-i586.bin in /usr/lib/jvm directory

[root@server]# chmod 755 jdk-1_5_0_15-linux-i586.bin
[root@server]# ./jdk-1_5_0_15-linux-i586.bin

Agree to the licence when it is displayed.

Set the system path to locate Java at: /usr/lib/jvm/jdk1.5.0_15

3.4 Install mysql-python

[root@server]# yum install mysql-devel
[root@server]# yum install zlib-devel

Download appropriate egg file at the following URL:

We downloaded MySQL_python-1.2.2-py2.5-win32.egg

[root@server]# easy_install MySQL_python-1.2.2-py2.5-win32.egg 
[root@server]# yum install sqlite-devel
[root@server]# easy_install pysqlite
[root@server]# easy_install rdflib==2.4.0
[root@server]# easy_install pylons
[root@server]# easy_install uuid
[root@server]# easy_install beautifulsoup
[root@server]# easy_install elementtree

4 Install some more python libraries

Next we need some Python libraries for later: an iCalendar library (vobject), an OpenID consumer library (python-openid), a library that can generate UUIDs (uuid), and a very good web framework called Pylons:

[root@server]# easy_install python-openid
[root@server]# easy_install uuid

Download python-dateutil-1.4.tar.bz2 from

[root@server]# tar -xvjf python-dateutil-1.4.tar.bz2
[root@server]# python2.5 install
[root@server]# easy_install vobject
[root@server]# easy_install pylons

5 Get Fedora-Commons and Apache Solr.

Download the Fedora and Solr packages:

[root@server]# su user
[user@server]$ cd /home/user
[user@server]$ wget
[user@server]$ wget

6 Make the server environment ready for Fedora Commons

If you now list the home directory, you should see something like this:

[user@server]:~$ ls
apache-solr-1.2.0.tgz  fedora-3.0b1-installer.jar

We will need the following:

6.1 A directory to store Fedora’s root directory (config files, logs, libraries, and default Tomcat instance)

I chose to store the Fedora root directory at /opt/fedora30b1 –

[user@server]$ sudo -s
[root@server]# mkdir /opt/fedora30b1

Let the user own it: (Remember to change ‘user’ to whatever your user is actually called!)

[root@server]# chown user:user /opt/fedora30b1

(Optional) And to aid upgrading, create a symlink at /opt/fedora to this folder:

[root@server]# ln -s /opt/fedora30b1 /opt/fedora
[root@server]# chown user:user -h /opt/fedora

Fedora needs certain environment variables to be set up now, FEDORA_HOME and JAVA_HOME at the very least. Open up the system wide profile (/etc/profile) and add them in there.

[root@server]# vi /etc/profile

And add the following lines to the end of the file (note that there *must not* be any spaces on either side of the ‘=’ character):

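# If you did not create the symlink, just point directly at your Fedora root
FEDORA_HOME=/opt/fedora30b1
# or if you did do the 'ln -s ...' step, use this instead:
# FEDORA_HOME=/opt/fedora
export FEDORA_HOME
# If you did not create the symlink, just point directly at your tomcat root
CATALINA_HOME=/opt/fedora30b1/tomcat
# or if you did do the 'ln -s ...' step, use this instead:
# CATALINA_HOME=/opt/fedora/tomcat
export CATALINA_HOME
JAVA_HOME=/usr/lib/jvm/jdk1.5.0_15
export JAVA_HOME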

Save the file

Now, to check that this has worked, type the command ‘exit’ a few times to logout and then log back in again as your default user. If things have worked well, the following commands should work:

[user@server]$ echo $FEDORA_HOME

[Or ‘/opt/fedora’ depending on what you chose.]

[user@server]$ echo $JAVA_HOME
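
The same check can be done from Python, which is handy if you want to test the environment that the web application itself will see. This is a sketch under the assumption of a Python 3 interpreter; missing_variables is a hypothetical helper name, not part of any library used in this guide.

```python
import os

def missing_variables(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]

# The two variables Fedora needs at the very least.
REQUIRED = ("FEDORA_HOME", "JAVA_HOME")

missing = missing_variables(REQUIRED)
if missing:
    print("Not set: " + ", ".join(missing))
else:
    print("All required variables are set.")
```

If either name is reported as not set, check /etc/profile again and make sure you have logged out and back in.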

6.2 A mysql database and account for Fedora to use

Now to sort out MySQL. Remember that default root password you set for MySQL? You’ll need it now.

[user@server]$ mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.0.45-Debian_1ubuntu3.1-log Debian etch distribution

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.


Now issue the following commands:

mysql> create database fedora30;
Query OK, 1 row affected (0.00 sec)

mysql> grant all on fedora30.* to 'fedoraAdmin'@'localhost' identified by 'PUTYOURPASSWORDHERE';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> ALTER DATABASE fedora30 DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)

mysql> ALTER DATABASE fedora30 DEFAULT COLLATE utf8_bin;
Query OK, 1 row affected (0.00 sec)

mysql> exit


(NB You may or may not need to add the utf-8 configuration lines for your particular version of MySQL, but as far as I know, the commands are harmless if you don’t need them and utterly crucial if you do. Well, crucial unless you are dealing purely with ascii, but could you really guarantee that?)

7 Install Fedora commons 3.0b1

(Note – Official installation guide is here)

Go to the location where you saved the fedora installer, probably the user’s home directory and run the installer. I’ll include the entire installation dialog here. Where the response is blank, I simply pressed enter to accept the default.

[user@server]$ cd /home/user
[user@server]$ java -jar fedora-3.0b1-installer.jar

Fedora Installation

To install Fedora, please answer the following questions.
Enter CANCEL at any time to abort the installation.
Detailed installation instructions are available at:

Installation type
The 'quick' install is designed to get you up and running with Fedora
as quickly and easily as possible. It will install Tomcat and an
embedded version of the McKoi database. SSL support and XACML policy
enforcement will be disabled.
For more options, including the choice of hostname, ports, security,
and databases, select 'custom'.
To install only the Fedora client software, enter 'client'.

Options : quick, custom, client

Enter a value ==> custom

Fedora home directory
This is the base directory for Fedora scripts, configuration files, etc.
Enter the full path where you want to install these files.

Enter a value [default is /opt/fedora] ==>

Fedora administrator password
Enter the password to use for the Fedora administrator (fedoraAdmin) account.


Fedora server host
The host Fedora will be running on.
If a hostname (e.g. is supplied, a lookup will be
performed and the IP address of the host (not the host name) will be used
in the default Fedora XACML policies.

Enter a value [default is localhost] ==>

Authentication requirement for API-A
Fedora's management (API-M) interface always requires user authentication.
Require user authentication for Fedora's access (API-A) interface?

Options : true, false

Enter a value [default is false] ==>

SSL availability
Should Fedora be available via SSL? Note: this does not preclude
regular HTTP access; it just indicates that it should be possible for
Fedora to be accessed over SSL.

Options : true, false

Enter a value [default is true] ==>

SSL required for API-A
Should API-A be accessible exclusively via SSL? If true, requests
to access API-A URLs will be automatically redirected to the secure port.

Options : true, false

Enter a value [default is false] ==>

SSL required for API-M
Should API-M be accessible exclusively via SSL? If true, requests
to access API-M URLs will be automatically redirected to the secure port.

Options : true, false

Enter a value [default is true] ==> false

Servlet engine
Which servlet engine will Fedora be running in?
Enter 'included' to use the bundled Tomcat 5.5.23 server.
To use your own, existing installation of Tomcat, enter 'existingTomcat'.
Enter 'other' to use a different servlet container.

Options : included, existingTomcat, other

Enter a value [default is included] ==> included

Tomcat home directory
Please provide the full path to your existing Tomcat installation, or
the path where you plan to install the bundled Tomcat.

Enter a value [default is /opt/fedora/tomcat] ==>

Tomcat HTTP port
Which HTTP port (non-SSL) should Tomcat listen on? This can be changed
later in Tomcat's server.xml file.

Enter a value [default is 8080] ==>

Tomcat shutdown port
Which port should Tomcat use for shutting down? Make sure this doesn't
conflict with an existing service. This can be changed later in Tomcat's
server.xml file.

Enter a value [default is 8005] ==>

Tomcat Secure HTTP port
Which port (SSL) should Tomcat listen on? This can be changed
later in Tomcat's server.xml file.

Enter a value [default is 8443] ==>

Keystore file
For SSL support, Tomcat requires a keystore file.
If the keystore file is located in the default location expected by
Tomcat (a file named .keystore in the user home directory under which
Tomcat is running), enter 'default'.
Otherwise, please enter the full path to your keystore file, or, enter
'included' to use the the sample, self-signed certificate) provided by
the installer.
For more information about the keystore file, please consult:

Enter a value ==> included

Policy enforcement enabled
Should XACML policy enforcement be enabled? Note: This will put a set of
default security policies in play for your Fedora server.

Options : true, false

Enter a value [default is true] ==> false

Enable Resource Index
Enable the Resource Index?

Options : true, false

Enter a value [default is false] ==> true

Enable the REST-API? The REST-API is an EXPERIMENTAL feature that exposes
the Fedora API with a REST-style interface. In particular, URL endpoints
should not be considered final, nor has policy enforcement been evaluated.
For more information about the REST-API, see

Options : true, false

Enter a value [default is false] ==> true

Please select the database you will be using with
Fedora. The supported databases are McKoi, MySQL, Oracle and Postgres.
If you do not have a database ready for use by Fedora or would prefer to
use the embedded version of McKoi bundled with Fedora, enter 'included'.

Options : mckoi, mysql, oracle, postgresql, included

Enter a value ==> mysql

MySQL JDBC driver
You may either use the included JDBC driver or your own copy.
Enter 'included' to use the included JDBC driver, or, enter the location
(full path) of the driver.

Enter a value [default is included] ==>

Database username
Enter the database username Fedora will use to connect to the Fedora database.

Enter a value ==> fedoraAdmin

Database password
Enter the database password Fedora will use to connect to the Fedora database.


Please enter the JDBC URL.

Enter a value [default is jdbc:mysql://localhost/fedora30?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true] ==>

JDBC DriverClass
Please enter the JDBC driver class.

Enter a value [default is com.mysql.jdbc.Driver] ==>

Successfully connected to MySQL
Deploy local services and demos
Several sample back-end services are included with this distribution.
These are required if you want to use the demonstration objects.
If you'd like these to be automatically deployed, enter 'true'.
Otherwise, the installer will put the files in your FEDORA_HOME/install
directory in case you want to deploy them later.

Options : true, false

Enter a value [default is true] ==>

Preparing FEDORA_HOME...
Configuring fedora.fcfg
Installing beSecurity
Installing Tomcat...
Preparing fedora.war...
Processing web.xml
Deploying fedora.war...
Deploying fop.war...
Deploying imagemanip.war...
Deploying saxon.war...
Deploying fedora-demo.war...
Installation complete.

Before starting Fedora, please ensure that any required environment
variables are correctly defined
For more information, please consult the Installation & Configuration
Guide, located online at or locally at

And that should merrily go away and install and set up Fedora and the bundled Tomcat server for you. Unlike other services you may install, this won’t start the Fedora service, nor will it create a handy startup/shutdown script that integrates with your Linux startup scripts in /etc/init.d. We will create one later on.

8 Further configuration of Fedora 3.0

!IMPORTANT! Fix the broken ‘mail.jar’ library! (Broken, as in the REST API will not work correctly with the version released in 3.0b1.)

Get it from here: and use it to replace the mail.jar found in $FEDORA_HOME/tomcat/webapps/fedora/WEB-INF/lib/mail.jar. Restart Tomcat if you need to.

If you can’t find $FEDORA_HOME/tomcat/webapps/fedora/ go to http://localhost:8080/fedora to run the fedora.war file which creates the directory.

I am keen on UUIDs, and I cannot see a good reason for not using them. I suggest using the fedora id ‘namespace’ of uuid, so that a fedora URI will look like


It is also trivial to generate these in python, consider the following code:

[user@server]$ python
Python 2.5.1 (r251:54863, Oct 5 2007, 13:36:32)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from uuid import uuid4
>>> uuid4().urn[4:]
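
The same idea can be wrapped in a small function for use in scripts. This is a sketch rather than code from the web archive itself: new_pid is a name invented here, and the ‘info:fedora/’ prefix in the last line is my assumption about the usual Fedora URI form.

```python
from uuid import uuid4

def new_pid(namespace="uuid"):
    """Return a Fedora-style PID of the form '<namespace>:<random UUID>'."""
    # str(uuid4()) is the 36-character hyphenated form of a random UUID.
    return "%s:%s" % (namespace, uuid4())

pid = new_pid()
print(pid)                   # a fresh 'uuid:...' identifier on every run
print("info:fedora/" + pid)  # assumed URI form for the object
```

With the default namespace the result has the same shape as uuid4().urn[4:], that is, the text ‘uuid:’ followed by the UUID.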

To get Fedora to accept these though, the ‘uuid’ namespace needs to be added to the retainPID region in fedora’s configuration file.

[user@server]$ vi /opt/fedora/server/config/fedora.fcfg

In vi, type /retainPID and press Enter to search for the setting. Add uuid to the list of namespaces (the ordering is not important):

<param name="retainPIDs" value="demo uuid test changeme ...
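
If you would rather verify the edit with a script than by eye, something like the following sketch could be used. It assumes Python 3 and assumes that fedora.fcfg is well-formed XML containing a param element named retainPIDs, as in the line above; retained_namespaces is a hypothetical helper and the sample document is a cut-down stand-in, not the real file.

```python
import xml.etree.ElementTree as ET

def retained_namespaces(fcfg_text):
    """Return the PID namespaces listed in the retainPIDs parameter."""
    root = ET.fromstring(fcfg_text)
    for element in root.iter():
        # Strip any XML namespace prefix ('{uri}param' -> 'param').
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "param" and element.get("name") == "retainPIDs":
            return element.get("value", "").split()
    return []

# A simplified stand-in for the real configuration file.
sample = """<server>
  <module role="example">
    <param name="retainPIDs" value="demo uuid test changeme"/>
  </module>
</server>"""

print("uuid" in retained_namespaces(sample))  # True
```

To run it against the real configuration, read /opt/fedora/server/config/fedora.fcfg in binary mode and pass the contents to the function.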

9 Installing Solr

[NB you will only have to follow the guide below, but here are the official docs, should you get in trouble – Basic installation – Tomcat specific things to bear in mind]

Extract the whole archive somewhere on disc and you will see something like this in the apache-solr-1.2.0 folder:

~/apache-solr-1.2.0$ ls
build.xml CHANGES.txt dist docs example KEYS.txt lib LICENSE.txt NOTICE.txt README.txt src

~/apache-solr-1.2.0$ ls dist
apache-solr-1.2.0.jar apache-solr-1.2.0.war

The easiest thing is to install Solr straight into the Tomcat instance that Fedora has installed. Be aware that search applications eat RAM and heap for breakfast, so install it on a server with plenty of RAM, and consider increasing the heap space available to the Tomcat instance. To do so, set the environment variable CATALINA_OPTS to "-Xmx512m" inside the script in your /opt/fedora/tomcat/bin directory.

[i.e. just add CATALINA_OPTS="-Xmx512m" at the beginning of the file if it doesn’t already exist.]

One final bit of advice before I point you at the rather good installation docs is that you might want to rename the .war file to match with the URL pathname you desire, as the guide relies on Tomcat automatically unpacking the archive:

So, a war called “apache-solr-1.2.0.war” will result in the final app being accessible at http://tomcat-hostname:8080/apache-solr-1.2.0/. We will rename ours when we copy it into Tomcat’s webapps directory.

Finally, Solr needs a place to keep its configuration files and its indexes. The indexes can get huge (1 GB is not unheard of), so make sure the location you choose has room to expand. The documentation linked to below refers to this location as ‘your solr home’. (NB this is not the directory inside Tomcat where the application was unbundled.)

So, let’s create a solr home in /opt as we did for fedora (NB change user):

[user@server]$ sudo -s
[root@server]# mkdir /opt/solr
[root@server]# chown user:user /opt/solr

Place the solr.war into Fedora’s Tomcat instance:

[root@server]# exit
[user@server]$ pwd
[user@server]$ cp dist/apache-solr-1.2.0.war $CATALINA_HOME/webapps/solr.war

Finally, we have to make sure a variable is available in Tomcat’s environment; the location of the Solr home directory. Remember that CATALINA_OPTS line we added before? Amend that now to look like:

(E.g. via vi $CATALINA_HOME/bin/ )

CATALINA_OPTS="-Xmx512m -Dsolr.solr.home=/opt/solr"

Now, as we will shape the Solr search service later on (i.e. choosing the fields to be indexed, and how to index them for faceted searching) we will just copy across the basic solr example, to make sure everything is running fine.

[Make sure you are in the unpacked solr directory:]

[user@server]$ pwd
[user@server]$ cp -a example/solr/* /opt/solr
[user@server]$ ls /opt/solr
bin conf README.txt

10 Adding HTTP authentication to Solr update

First add a username/password to tomcat/conf/tomcat-users.xml:

<user username="solradmin" password="XXXXXXXX" roles="solradmin"/>

Then, in your Solr context, in tomcat/webapps/solr/WEB-INF/web.xml, add the following:


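.... usual stuff ....

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr update</web-resource-name>
    <url-pattern>/update/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>solradmin</role-name>
  </auth-constraint>
</security-constraint>

<!-- Define the Login Configuration for this Application -->
<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Auth needed</realm-name>
</login-config>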


NB BASIC authentication sends the password effectively in plain text (it is only base64-encoded), so this isn’t too great, but it is suitable for a localhost updater. Change this to DIGEST to increase the security, but bear in mind you may need to set the Realm for the Tomcat container and the Digest hash mechanism (SHA1, MD5, etc.).
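
To see why BASIC authentication offers so little protection on its own, consider how the header is built. The sketch below assumes Python 3; both function names are made up for illustration, and the credentials are the placeholder ones from tomcat-users.xml.

```python
import base64

def basic_auth_header(username, password):
    """Build the value of the Authorization header for HTTP BASIC auth."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")

def decode_basic_auth(header_value):
    """Recover (username, password) from a BASIC Authorization header."""
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password

header = basic_auth_header("solradmin", "XXXXXXXX")
print(header)
# Anyone who can see the header on the wire can reverse it just as easily.
print(decode_basic_auth(header))
```

Nothing secret is needed to decode the header, which is why BASIC is best kept to localhost or to connections that are already encrypted.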

(Some good guides to securing Tomcat services are but a Google search away.)

11 Test your foundation

Now, we need to start up Fedora, and hopefully, it will all go smoothly:

[user@server]$ cd /opt/fedora/tomcat/bin/
[user@server]$ ./
Using CATALINA_BASE: /opt/fedora/tomcat
Using CATALINA_HOME: /opt/fedora/tomcat
Using CATALINA_TMPDIR: /opt/fedora/tomcat/temp
Using JRE_HOME: /usr/lib/jvm/java-1.5.0-sun

Now try these links:

http://localhost:8080/fedora/describe – make sure ‘uuid’ is one of the retainPIDs
http://localhost:8080/solr/admin Should look like a whole heap of options and bells and whistles.

Any 404 or 500 server error means that something has come unstuck.
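
The two links above can also be checked from a script, which is useful on a server without a browser. This is a rough sketch assuming Python 3; http_status is a hypothetical helper, and it deliberately ignores any HTTP_PROXY setting because the requests are meant for the local Tomcat.

```python
import urllib.error
import urllib.request

# An opener with an empty proxy map, so local checks never go via a proxy.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with _opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error such as 404 or 500.
        return error.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: Tomcat is probably down.
        return None

if __name__ == "__main__":
    for url in ("http://localhost:8080/fedora/describe",
                "http://localhost:8080/solr/admin"):
        print("%s -> %s" % (url, http_status(url)))
```

A result of 200 for both is what you are hoping for; None suggests that Tomcat is not running at all.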

12 Python-FedoraCommons-Webarchive

12.1 Installation

12.1.1 Download tar file


Download and unarchive the archive_vXX.tar.gz

12.1.2 Schema file

If you are planning on using simple Dublin Core metadata as your core metadata vocabulary, there is a ‘default_solr_schema.xml’ in the root directory of the archive tar file.

The schema.xml will create fields for the default 14 terms.

The fields are directly linked to the Dublin Core names, e.g. dc:title -> ‘title’ field in Solr. All fields also have an additional facet field with the prefix ‘f_’. Therefore ‘title’ is the field to search, but ‘f_title’ is the field to draw facets from.
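
The naming rule is simple enough to express in a couple of lines. The following is only an illustration of the convention described above, assuming Python 3; solr_fields is a made-up name and is not a function from the web archive code.

```python
def solr_fields(dc_term):
    """Map a Dublin Core term such as 'dc:title' to its two Solr field names."""
    # Drop the 'dc:' prefix if present: 'dc:title' -> 'title'.
    name = dc_term.split(":", 1)[-1]
    # The search field uses the bare name; the facet field adds 'f_'.
    return {"search": name, "facet": "f_" + name}

print(solr_fields("dc:title"))  # {'search': 'title', 'facet': 'f_title'}
```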

Restart Solr with its new schema.xml.

12.1.3 Edit Search Terms

user@server # cd archive/lib/

Edit the ‘’ file to reflect the fields present in Solr.

This is currently a static, hand-edited list, but it is a library function because I intend to ‘auto-magically’ derive the fields from the Solr instance itself by downloading its schema and parsing it.

12.1.4 Change access settings and passwords

Open lib/ and lib/ for editing.

Change the URL and credentials for the Fedora Commons service and the Solr service present at the top of this file. (If your Solr service needs no password, just use empty strings, ''.)

Also, look for the setting of self.root. This is the host of your application, so change it from ‘http://localhost:5000/’ to whatever your server location is.

12.1.5 Start serving the application

Locate development.ini at the root of the archive directory

user@server# paster serve --reload development.ini

This should start serving the application using the settings in development.ini (which has debugging turned on by default). The --reload flag quietly and automatically reloads any pages, controllers or templates that it detects to have changed.

Should you want to change the port on which this application runs, the .ini file contains a property near the beginning of the file, reading ‘port: 5000’. Change this to whichever port is needed, but bear in mind that low ports, such as 80 and 443, are restricted and the server would need to be run with elevated permissions.
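
If you want to check the configured port from a script, for instance before deciding whether the server must be started as root, a sketch along these lines may help. It assumes Python 3 (where the module is called configparser) and assumes the port lives in a [server:main] section, which is the usual Paste convention but is not confirmed by anything above; server_port and needs_root are hypothetical names and the sample text is not the real development.ini.

```python
import configparser

def server_port(ini_text, section="server:main"):
    """Return the port configured in the given section of a .ini file."""
    # Interpolation is disabled so '%' characters elsewhere cause no trouble.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.getint(section, "port")

def needs_root(port):
    """Ports below 1024 are privileged on Linux."""
    return port < 1024

# A cut-down stand-in for development.ini; the section name is an assumption.
sample = """
[server:main]
host = 0.0.0.0
port: 5000
"""

port = server_port(sample)
print(port, needs_root(port))  # 5000 False
```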

Need to make it https? Create or get a ssl.pem file (as you might for an Apache server) and add this to the root of the web application. Add into the .ini file, just under the ‘port’ declaration – ‘ssl_pem: name_of_pem_file.pem’

13 Harvest from Eprints

  1. Go to archive/helpful_scripts
  2. Copy to archive/lib
  3. Modify the Fedora and Solr passwords in
  4. Set http_proxy
  5. Run the Python script:

user@server# python EprintsURL/cgi/export_all?format=ResMapUrls

If the script fails with the error ‘ImportError: cannot import name xpath’, try:

root@server# easy_install pyxml

  6. Once the script has harvested items into Fedora, run the scan items script found in the lib directory:

user@server# python
