TITLE

CMap Administration and Data Curation



VERSION

$Revision: 1.46 $

This document is intended to help you understand how to use the data curation tools provided with CMap. There are three tools you will use: the configuration files, the ``cmap_admin.pl'' command-line interface and the web-based administration interface.

The configuration file is used to create and customize the look of map, feature and evidence types as well as customizing the CMap experience.

The ``cmap_admin.pl'' program is used for all long-running processes that are not practical to address over the HTTP protocol. It employs a ``wizard''-like approach to accomplishing various tasks by asking the curator a series of questions and performing the desired action using the answers provided. cmap_admin.pl can also be driven via command line arguments.

The web admin tool is meant to provide a point-and-click interface for performing the more mundane administrative tasks or for viewing the data.

All of this is discussed in further detail in this document.



STARTING OUT

If you have just installed CMap, then your database is probably completely empty. (Of course, a small but complete dataset is provided in the ``data'' directory, but the database for your own data will be empty.) So let's start by discussing what you need to do to get a usable installation with your data.

This document assumes that you've already been through all the steps described in the INSTALL.pod document, so you should be able to pull up the web-based admin tool by pointing your browser to ``http://your.host.name/cgi-bin/cmap/admin'' (or wherever you have installed it). Depending on whether or not you decided to password-protect that URL, you may have to enter a username and password when prompted by your browser.



EDITING CONFIGURATION FILES

The first step to getting your data into CMap is to correctly set up your configuration files. The config files tell CMap how to access your database. Also, each map type, feature type and evidence type of the data that you will be installing needs to be defined in the configuration files.

CMap has multiple configuration files. These files are read from the ``cmap.conf/'' directory, which is likely located in your Apache ``conf/'' directory. There are two types of config files. There is one (and only one) ``global.conf'' file that holds information applicable to all of the databases. The same directory also holds at least one individual configuration file for each database used. Everything that can be customized about CMap is controlled by these files, which are described in detail below.

The configuration files are written in a standard Apache-style configuration file syntax. Comments are defined as anything following a hash sign and are ignored. Options which are grouped together are in angle brackets (``<>''). Standard options are written in a ``Name Value'' syntax where the two are separated by whitespace or an equal sign. For more information on the syntax, read the POD for Config::General (by executing ``perldoc Config::General'' on your system).
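As a rough illustration of the syntax just described, the following Python sketch parses a small subset of it: hash comments, ``Name Value'' pairs separated by whitespace or an equal sign, and nested angle-bracket blocks. This is not how CMap reads its files (CMap relies on the Perl module Config::General, which supports far more), and the function name ``parse_config'' is made up for this example.

```python
import re

def parse_config(text):
    """Parse a tiny subset of Apache-style config text into nested dicts.

    Illustration only: repeated names overwrite each other, a '#' always
    starts a comment, and quoting, includes and line continuation are
    not handled.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments and padding
        if not line:
            continue
        if line.startswith("</"):                # closing tag ends the block
            stack.pop()
        elif line.startswith("<"):               # opening tag starts a block
            name = line.strip("<>").strip()
            block = {}
            stack[-1][name] = block
            stack.append(block)
        else:                                    # "Name Value" or "Name = Value"
            parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
            stack[-1][parts[0]] = parts[1] if len(parts) > 1 else ""
    return root

sample = """
# hypothetical fragment
is_enabled = 1
<button>
    text For Publication
    <if>
        clean_view 0
    </if>
</button>
"""
config = parse_config(sample)
print(config["button"]["if"])
```

Reading a file this way can be handy for quick sanity checks, but the Config::General POD remains the authority on what the syntax allows.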

Each configuration setting is documented in comments. Legal and default values are listed along with a brief description of the option. Except for the ``database'' and map/feature/evidence_type settings, there is a reasonable default, hard coded value for every configuration setting (so you could actually comment out all of the ``optional settings'' in the file).

Whenever you make a change to cmap.conf, you will most likely need to purge the CMap cache before the change becomes visible, at least if the change affects something in the viewer. This is particularly true if map, feature or evidence type information has been changed, because that information is cached along with database search results. Purging the query cache can be done by using the cmap_admin.pl program (explained later in this document).

global.conf

The global.conf contains information which is used by all of the other configurations. You will want to set the default_db value.

INDIVIDUAL CONFIGURATION FILES

CMap supports multiple data sources and multiple customizations for each. This allows a curator to maintain distinct databases and view them all in CMap.

If you have only one database, then you need only one configuration file of this type. Adding another is as simple as creating a new configuration file. Any file that is in the ``cmap.conf/'' directory that ends in ``.conf'' will be read as a configuration file.
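To see which files would be picked up under the rule above, a short Python sketch such as the following can list them. It only mirrors the naming rule stated here (files ending in ``.conf'', with ``global.conf'' treated separately); the function name and the directory passed to it are assumptions for the example, not part of CMap.

```python
from pathlib import Path

def find_config_files(conf_dir):
    """Return (global_conf, individual_confs) for a cmap.conf/ style directory.

    Only files whose names end in ".conf" are considered; anything else
    (for example an included ".cfg" fragment) is skipped.
    """
    conf_dir = Path(conf_dir)
    confs = sorted(
        p for p in conf_dir.iterdir() if p.is_file() and p.suffix == ".conf"
    )
    global_conf = next((p for p in confs if p.name == "global.conf"), None)
    individual = [p for p in confs if p.name != "global.conf"]
    return global_conf, individual
```

Calling ``find_config_files("/path/to/cmap.conf")'' returns the global file (or None if it is missing) and the list of per-database files, which makes it easy to spot a stray ``.conf'' file that CMap would treat as another data source.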

All of the parameters that can be specified in an individual config file are described here.

The required elements are the Required General Options and the Map, Feature and Evidence Type Information. The other elements all have reasonable defaults and can be ignored until you want to tweak them.

The file ``example.conf'' (which is installed in the cmap.conf/ directory) provides a good starting point for your configuration. When you are done customizing your config file, make sure you set ``is_enabled'' to 1.

Required General Options

Map, Feature and Evidence Type Information

Starting with v0.13 map, feature and evidence types (*_types) are defined in the configuration files and not in the database. This means that to add an object of *_type, its type must be defined here in the config file.

The accession id of the *_type is listed in two places for technical reasons. It is important that both places have the same accession id. One is in the initial tag, for example <map_type X>, where 'X' is the accession. The other is in the matching '*_type_acc' field inside that block (for example, the 'evidence_type_acc' field of an <evidence_type X> block).
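Because the two copies of the accession must agree, a small check can catch mismatches before they cause trouble. The Python sketch below scans config text with regular expressions. It assumes that the field inside each block is named after the block (``map_type_acc'', ``feature_type_acc'', ``evidence_type_acc'') and that blocks are not nested inside blocks of the same kind; both are assumptions of this example, and the function name is invented.

```python
import re

# Matches <map_type X> ... </map_type> and the two sibling block kinds.
BLOCK_RE = re.compile(
    r"<(map_type|feature_type|evidence_type)\s+([^>\s]+)\s*>(.*?)</\1>",
    re.DOTALL,
)

def find_accession_mismatches(config_text):
    """Return (kind, tag_accession, field_accession) for each disagreement.

    field_accession is None when the *_type_acc line is missing entirely.
    """
    problems = []
    for kind, tag_acc, body in BLOCK_RE.findall(config_text):
        field = re.search(
            rf"^\s*{kind}_acc(?:\s*=\s*|\s+)(\S+)", body, re.MULTILINE
        )
        field_acc = field.group(1) if field else None
        if field_acc != tag_acc:
            problems.append((kind, tag_acc, field_acc))
    return problems
```

An empty result means every block found carries the same accession in both places. This is only a convenience check and does not replace the validation script shipped with CMap.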

It is important to note that some of the information is stored in the database when features and map sets are created. Fields like map units and default_rank will not change in the database if you change them only in the config file. This may be something to be addressed in the future.

Common fields.

Feature type specific options

Evidence type specific options

A single correspondence can be supported by any number of evidence types, so it is necessary that these be ranked from ``1'' to N (where lower numbers have precedence over higher). When multiple records support a correspondence, only the highest ranking evidence will be used when determining how to draw the correspondence line. If you intend to use the cmap_admin.pl tool to create correspondences for you based on simple name/feature type comparisons, you should create a special evidence type to use just for that purpose (e.g., with the name ``Automated name-based'' or something similar to flag it as a machine-created/non-curated correspondence). Evidence type color will only be used when NOT aggregating.
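The ranking rule can be pictured with a few lines of Python. This is a sketch of the rule as stated above (rank 1 wins), not CMap's drawing code; the dictionary keys and the ``Curated'' type name are invented for the example.

```python
def top_evidence(evidences):
    """Return the evidence record with the best rank (1 is best)."""
    return min(evidences, key=lambda ev: ev["rank"])

supporting = [
    {"type": "Automated name-based", "rank": 3},
    {"type": "Curated", "rank": 1},
]
# The curated evidence decides how the correspondence line is drawn.
print(top_evidence(supporting)["type"])
```

Giving the automated evidence type a high rank number keeps it from overriding curated evidence when both support the same correspondence.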

Map type specific options

Menu Defaults

Map Menu Options

Feature Menu Options

Correspondence Menu Options

Display Menu Options

Advanced Menu Options

Presentation Options

Web Related Options

JavaScript options

These options are only used to allow complex JavaScript to be specified in the config document. These can be used in conjunction with the area_code of feature_types and map_types.

Configuration Defined Buttons

CMap allows the creation of buttons to modify options in the CMap viewer menu. Conditions can be set that would need to be met before the button will appear.

A simple example would be to add a ``For Publication'' button that would set the ``clean_view'' option and would only appear when the ``clean_view'' option was off.

The impetus for adding this feature was the need for a button that would toggle between an overview-style view and a more detailed view. Clicking the ``Overview'' button would set the feature display options to ``corr_only'' for all but a select set of feature types, which would have the effect of de-cluttering the view. A ``Detailed'' button could set all the feature display options to always ``display''.

Configuration Structure

All buttons are enclosed in an <additional_buttons> tag.

The following is the configuration for the ``clean_view'' buttons mentioned above.

  <additional_buttons>
      <button>
          text For Publication
          <if>
              clean_view 0
          </if>
          <set>
              clean_view 1
          </set>
      </button>
      <button>
          text Return Navigation Buttons
          <if>
              clean_view 1
          </if>
          <set>
              clean_view 0
          </set>
      </button>
  </additional_buttons>

Note that there are two buttons, one for when ``clean_view'' is set and one for when it is not. They work together.
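How the two buttons take turns can be sketched in Python. The sketch assumes that a button appears only when every condition in its <if> block matches the current viewer state, and that a missing or blank boolean parameter counts as ``0''; the second point is an assumption of this example, as are the function names.

```python
def button_is_visible(button, state):
    """True when every <if> condition matches the current viewer state."""
    def norm(value):
        # Assumption: blank or missing is treated like "0".
        return "0" if value in (None, "") else str(value)
    conditions = button.get("if", {})
    return all(norm(state.get(k)) == norm(v) for k, v in conditions.items())

def apply_button(button, state):
    """Return a new state with the button's <set> values applied."""
    new_state = dict(state)
    new_state.update(button.get("set", {}))
    return new_state

for_publication = {
    "text": "For Publication",
    "if": {"clean_view": "0"},
    "set": {"clean_view": "1"},
}
return_navigation = {
    "text": "Return Navigation Buttons",
    "if": {"clean_view": "1"},
    "set": {"clean_view": "0"},
}
```

Starting from a state where ``clean_view'' is off, only ``For Publication'' is visible; after it is applied, only ``Return Navigation Buttons'' is visible, so exactly one of the pair is offered at any time.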

Each <button> is its own element with the following fields.

Testable Parameters

- Boolean Parameters

These parameters can be tested using 1 or 0 (or blank).

    highlight
    collapse_features
    scale_maps
    stack_maps
    omit_area_boxes
    show_intraslot_corr
    split_agg_ev
    clean_view
    corrs_to_map
    ignore_image_map_sanity
    dotplot

- String Parameters

These parameters are a little more than boolean. They may be numbers (for instance, the ``aggregate'' radio button values are 0, 1 or 2). View the page source of the CMap viewer to see the options of the radio buttons.

    prev_ref_species_acc
    prev_ref_map_set_acc
    ref_species_acc
    ref_map_set_acc
    image_type
    label_features
    aggregate
    comp_menu_order
    data_source
    ref_map_start
    ref_map_stop
    font_size
    pixel_height
    dotplot_ps

- Feature and Evidence Type Parameters

These parameters are a little different. To query if a feature type is displayed, that feature type must be included as a display_feature_type.

    display_feature_type feature_type_acc
    corr_only_feature_type feature_type_acc
    ignored_feature_type feature_type_acc
    included_evidence_type evidence_type_acc
    ignored_evidence_type evidence_type_acc
    less_evidence_type evidence_type_acc
    greater_evidence_type evidence_type_acc

An example button:

    <button>
        text No Marker
        <if>
            display_feature_type marker
        </if>
        <set>
            ft_marker 0
        </set>
    </button>

Miscellaneous Options

Title and Intro Text Options

Titles and Introductory texts for various pages.

Administration Options

Map Set Creation Defaults

These are the default settings for when map sets are created.

INCLUDING OTHER CONFIGURATION FILES

If there is a section of the individual configuration files that is repeated in multiple files, that section can be placed in a separate file. That file can then be included in each configuration file.

 <<include foo_type_info.cfg>>

It is recommended that any included files be named with a ``.cfg'' extension. Do not name an included file with a ``.conf'' extension, otherwise it will be read as a separate config file.

VALIDATING CONFIGURATION FILES

You can use the cmap_validate_config.pl script to test your individual config files (not global.conf yet) and make sure that they are valid. This can save you a lot of headaches dealing with configuration problems. See the section on VALIDATE CONFIG FILES for more information.

CONFIGURING FOR SPEED AND CLARITY

There are a few configuration options that can be used to increase the speed and clarity of CMap. This is a collection of those options. The details are described above.

All of these can be overridden by the user.



WEB-ADMIN TOOL

The home page for the web-based CMap administration tool has links for all the different actions you can take:

Each link is self-explanatory. In addition to the above links, you will also see the current data source to which you are connected. You may configure CMap to connect to different distinct data sources by having multiple configuration files; see the section EDITING CONFIGURATION FILES above for more information on this. If you have configured more than one data source, you will also see a drop-down control allowing you to connect to a different data source.

On each ``View'' page, you have access to edit and delete the information there (except for map, feature and evidence type information which are defined in the config file). Each page will be discussed in detail later. For the moment, let's just focus on what you need to import your first data set. To do that, you'll need to first set up the species.



CREATING AND IMPORTING DATA

IMPORTING GFF DATA

Starting with CMap version 1.01, CMap is able to import data in the GFF3 file format.

Using special CMap-specific extensions, all the CMap data can be imported in one file. This allows a whole database worth of CMap data to be imported in one step (rather than the multiple steps previously required, which are described below).

Alternatively, regular GFF3 (as used by GBrowse) can be imported if a map set is specified.

The format is described in detail in the Bio::DB::SeqFeature::Store::cmap module. It can be accessed by running the command:

  $ perldoc Bio::DB::SeqFeature::Store::cmap

The GFF files can be imported using the cmap_admin.pl script, either using the menu system or by using the command line interface (examples follow).

The first example uses a GFF file, ``cmap_style.gff'', that has the CMap extensions to define which species and map sets the data belongs to.

  $ cmap_admin.pl -d DATASOURCE --action import_gff cmap_style.gff

This second example defines the map set accession to import the data into. However, if a map set is defined in the file, the map set in the file will be used.

  $ cmap_admin.pl -d DATASOURCE --action import_gff --map_set_acc MAP_SET_ACCESSION gbrowse_style.gff

Note: In order for GFF importing to work correctly, it requires a copy of bioperl-live downloaded after June 7th, 2008.

EXAMPLE GFF IMPORT

A sample CMap GFF3 file, ``test_data.gff'', is located in the data directory of the distribution. This can be imported into the demo database (which can be created using the ``./Build demo'' command).

  $ cmap_admin.pl -d CMAP_DEMO --action import_gff data/test_data.gff
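If you drive the GFF import from a script, the command lines shown above can be assembled programmatically. The Python sketch below uses only the flags that appear in the examples (``-d'', ``--action import_gff'' and ``--map_set_acc''). It assumes that cmap_admin.pl is on your PATH and that it reports failure through a non-zero exit status; neither is stated in this document, and the function names are invented.

```python
import subprocess

def import_gff_command(data_source, gff_file, map_set_acc=None):
    """Build the argument list for a GFF import with cmap_admin.pl."""
    cmd = ["cmap_admin.pl", "-d", data_source, "--action", "import_gff"]
    if map_set_acc is not None:
        # Only needed for regular (GBrowse-style) GFF3 without CMap extensions.
        cmd += ["--map_set_acc", map_set_acc]
    cmd.append(gff_file)
    return cmd

def import_gff(data_source, gff_file, map_set_acc=None):
    """Run the import; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(import_gff_command(data_source, gff_file, map_set_acc), check=True)
```

For instance, ``import_gff("CMAP_DEMO", "data/test_data.gff")'' would run the same command as the demo example above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.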

SETTING UP SPECIES

Click on ``View Species'' from the home page of the web-admin tool. As you currently have no species to view, you should see the message ``No species to show.'' To add a new species, click on the ``Create New Species'' link at the top of the page. You should now see a page entitled ``Species Create'' with a form for the following fields:

Note: All the accession id columns in the CMap tables act the same. They are all character fields, so they will accept any combination of numbers and letters you care to use. Please don't use spaces or characters outside the ranges ``a-z,'' ``A-Z,'' ``0-9'' or dashes (``-'') as this will likely only cause you headaches. It is also not necessary to explicitly assign any accession IDs. While they *are* required by the database, there is code in place to ensure that the accession ID is set to the primary ID of the record if the accession ID is empty. Once your accession IDs have been established and publicized, they should never change.

Also, it is best to avoid strictly numeric accession ids since the automatic accessions are numeric and this can cause conflicts.
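The accession ID advice above is easy to check mechanically before you load data. The following Python sketch flags IDs that use characters outside the recommended set or that are strictly numeric. It encodes only the recommendations given here; CMap itself does not necessarily enforce them, and the function name is hypothetical.

```python
import re

_ALLOWED = re.compile(r"[A-Za-z0-9-]+")

def check_accession_id(acc):
    """Return a list of warnings for a proposed accession ID (empty if fine)."""
    if acc == "":
        # Left empty on purpose: CMap falls back to the record's primary ID.
        return []
    warnings = []
    if not _ALLOWED.fullmatch(acc):
        warnings.append("contains characters outside a-z, A-Z, 0-9 and '-'")
    if acc.isdigit():
        warnings.append("strictly numeric; may clash with automatic accessions")
    return warnings
```

Running this over the accession column of an import file before loading it is cheaper than renaming accessions after they have been publicized.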

The fields marked ``(opt.)'' do not require you to enter a value. If one is really required for the database (e.g., the ``accession_id''), then a reasonable default will be provided (e.g., the primary key value for accession IDs). When you are done, hit the ``Submit'' button. If there are errors, they will be reported to you and you will have to correct them before submitting again. If there are no errors, your entry will be accepted and you will be returned to the ``Species View'' page.

Note: On all the pages with clickable column headers, clicking on a column name will re-sort the data by that column.

If you are unhappy with any of the data you see, you can click on the ``Edit'' link of the record that displeases you and correct the faults. If you have created an unnecessary species, you can delete it by clicking on the ``Delete'' link. After confirming that you really wish to delete the species, it will be *permanently* removed from the database.

Note: There is no ``undo'' function for deletes, so be sure whenever you decide to remove an object from the database that you really mean it. When an object has other objects that rely on it (e.g., species records are linked to map sets), you will not be allowed to remove the object until all dependencies to it are removed (e.g., no more map sets use the species you want to remove).

Once you've set up all the species you wish to have in your database, click on the ``Home'' link in the upper-left corner to return to the web-admin home page.

Note: Throughout the admin interface, the ``create'' and ``edit'' pages for any object (e.g., species, map types, map sets, etc.) have the same fields in the same order with the same restrictions. If this document only mentions the ``create'' or ``edit'' page for an object, rest assured that the complementary page works the same as the one described.

CREATING A NEW MAP SET

Once you have species set up and map types defined, you are ready to create a new map set. Everything in CMap is designed to be generic, so a ``map set'' is simply a collection of maps. What you group together is entirely up to you. On Gramene, map sets tend to correspond to published studies of organisms and contain maps that represent chromosomes, linkage groups, FPC contigs, and such. As the database design allows only one species and one map type to be linked to a map set, it is best to keep these narrowly defined.

USING THE WEB ADMIN TOOL TO CREATE A MAP SET

To create a new map set, click on the ``Create New Map Set'' link from the admin home page or from the ``Map Sets View'' page. If you have failed to set up species or map types, you will be prompted to do so before continuing. You will see the following fields:

USING THE cmap_admin.pl TOOL TO CREATE A MAP SET

You can also create a new map set by using the ``cmap_admin.pl'' tool. There are only a few functions that exist in both tools, and this one was only added to cmap_admin.pl in order to make importing new data sets more convenient. You can only specify the species, map type, long and short names, and accession ID for the map set. Everything else about the map set must be edited using the web admin tool.

To create a new map set with cmap_admin.pl, start the script and choose the ``Create new map set'' option. Answer the questions appropriately and confirm your decision. If you see no errors, the map set was successfully created.

Note: When using cmap_admin.pl, questions which have only one choice are automatically answered by the tool. In the above example, if you had only one species in the database, cmap_admin.pl will not ask you which species to associate with the new map set. There can only be one answer, so it answers the question automatically.

VIEWING NEW MAP SET

Once you have successfully created a map set with the web admin tool, you will be taken to the view of that map set. You should see the data that you entered and that the set currently has no maps associated with it. You can either create each map for the map set individually or you can import the data for the map set. Most likely, you will want to do the latter, so that is the next section.

IMPORTING MAP DATA

Using the cmap_admin.pl, you can import a tab-delimited file containing the data for a map set with the following fields and data types:

For a more thorough treatment of these fields, read the ``import_tab'' section of ``perldoc Bio::GMOD::CMap::Admin::Import.''

The first line of the file should be the tab-separated names of the fields in whatever order you're supplying them (the order of fields is not important). You should use the above names for the fields, but you can use spaces and capitalization for the column names, if you like, as spaces will be converted to underscores and the names lowercased (e.g. ``Feature Alt Name'' will become ``feature_alt_name'').
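The header conversion described above amounts to the following, shown here as a Python sketch for clarity. The importer itself is Perl, so this only mirrors the stated rule (spaces become underscores, names are lowercased); the function names are made up.

```python
def normalize_header(name):
    """Lowercase a column name and turn spaces into underscores."""
    return name.strip().lower().replace(" ", "_")

def read_header(first_line):
    """Split the tab-separated first line into normalized field names."""
    return [normalize_header(col) for col in first_line.rstrip("\n").split("\t")]

print(read_header("Map Start\tMap Stop\tFeature Alt Name\n"))
```

This is useful when you generate import files from a spreadsheet and want to confirm that your column titles will map onto the expected field names.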

If the fields ``map_start'' and ``map_stop'' are not supplied, then the start and stop positions of the map will be determined after importing all the features by selecting the MIN and MAX start and stop positions from the ``cmap_feature'' table.
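In other words, the map bounds are derived from the feature positions. A minimal Python sketch of that idea follows; it assumes each feature has a start and an optional stop, and it takes the smallest and largest of all positions. The exact query CMap runs against ``cmap_feature'' may differ in detail.

```python
def derive_map_bounds(features):
    """features: iterable of (start, stop) pairs; stop may be None.

    Returns (map_start, map_stop) as the smallest and largest position seen.
    """
    positions = []
    for start, stop in features:
        positions.append(start)
        if stop is not None:
            positions.append(stop)
    return min(positions), max(positions)
```

For features at (10.0, none), (2.5, 4.0) and (30.0, 35.5), the sketch yields a map that runs from 2.5 to 35.5. If your map should extend beyond its outermost features, supply ``map_start'' and ``map_stop'' explicitly.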

Use the ``cmap_admin.pl'' script to bring in your correctly formatted data. Run the script, optionally passing your data file as an argument, like so:

  $ cmap_admin.pl my_groovy_maps.dat

If you pass a file as an argument, you will be asked to confirm that that is the file you want to use. If you answer ``no'' or did not pass a file, then you will be asked to locate the file containing your data. Type in the path to the file (noticing that you can use tab completion), or type ``q'' to exit file selection and return to the ``Main Menu.'' Once you've found your file, you'll need to tell the tool which map set the data corresponds to. First choose the map type and then the species of the map set, then the map set itself. Lastly, you will need to confirm your choices. If all goes well, you should see a lot of lines fly by giving you the step-by-step progress, the message ``Done,'' and then you will be returned to the ``Main Menu.'' A complete log of the actions taken by the script will be stored in a file called ``cmap_admin_log.X'' (where ``X'' is an incrementing number). See the docs on cmap_admin.pl (by typing ``perldoc cmap_admin.pl'') for more info.

When you import data for a map set that already has data, all existing maps and features with the same name as maps and features in your data will be updated. You also have the option of deleting any data that isn't updated. If you choose this ``overwrite'' option, any of the pre-existing maps or features that aren't updated will be deleted, as it will be assumed that they are no longer present in the dataset.
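The difference between a plain re-import and an ``overwrite'' re-import can be sketched in Python with dictionaries keyed by feature name. This is a simplified model of the behavior described above, not the importer's code; the feature names and the function name are invented.

```python
def merge_import(existing, incoming, overwrite=False):
    """Model a re-import into a map set that already has data.

    existing, incoming: dicts mapping a feature (or map) name to its data.
    Returns (result, deleted_names).
    """
    result = dict(existing)
    result.update(incoming)      # same name: updated; new name: added
    deleted = []
    if overwrite:
        # Anything not present in the new file is assumed to be gone.
        deleted = sorted(name for name in existing if name not in incoming)
        for name in deleted:
            del result[name]
    return result, deleted

old = {"RM1": 10.0, "RM2": 20.0}
new = {"RM2": 21.5, "RM3": 30.0}
print(merge_import(old, new))
print(merge_import(old, new, overwrite=True))
```

Without ``overwrite'', RM1 survives untouched; with it, RM1 is deleted because it does not appear in the new file. Keep this in mind when importing a partial file into an existing map set.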

While the import process is running, it may encounter feature types in your data which do not exist in the database. If this happens, the program will die right there and you will be left to the task of figuring out what was the last data inserted, defining the feature type in the config file and re-running the remaining data.

When cmap_admin.pl has successfully finished importing your map data, you will be returned to the ``Main Menu.'' From here, choose to ``Quit'' as the rest of the functionality will be covered later.

IMPORTING XML DATA (EXPERIMENTAL)

As of 0.10, CMap exports and imports data in XML format. The standard tab-delimited format is very convenient and easy to generate, but it makes it difficult to indicate hierarchical relationships among data. As such, some experimental code has been added to export the concept of ``objects'' from the database. These objects will contain all the information within them necessary to duplicate themselves entirely in another CMap database. This code is functional but still marked ``experimental'' as it appears to be very slow when exporting very large map sets.

To try this feature, choose ``Export data'' and then ``Database objects.'' Follow the directions from there. Then try importing the data (into another database, of course) using the ``Import data'' option and then ``Import CMap objects.'' Database objects are only created on import at this time; there is no updating of existing objects, so be careful!

MAKING CORRESPONDENCES

Once you have more than one set of maps in CMap, you can use it for what it was designed: showing comparisons. To do this, you must first create the correspondences between the features on the maps. Before you can do this, you'll need to establish the evidence types that can be used to support the correspondences. These are defined in the configuration file. For how to create evidence types, see EDITING CONFIGURATION FILES above.

CREATING CORRESPONDENCE RECORDS

Once you have evidence types, you can create the correspondence records. It might be helpful to you to inspect ``cmap-schema.png'' and ``cmap-schema-graph.png'' images in the ``docs'' directory that visualize the CMap tables and their relationships. (Also included is ``cmap-schema-desc.html,'' a breakdown of the tables into HTML tables for easy viewing.) The tables involved are:

For descriptions of these tables, look at the ``CODE_OVERVIEW.pod'' document in the ``docs'' directory.

There are three ways to create correspondences:

  1. Create each by hand using the web admin tool.

    To do this, locate a feature on a map by:

    A) ``View'' a map set, then ``view'' a map, then ``view'' the feature

    B) Click on ``Search for a Feature'' and find the feature by name

    Once you have located a feature and have navigated to the ``View Feature'' page, click on the link to ``Add Correspondence.'' This will take you to the ``Feature Correspondence Create'' page where you can search for the other feature in the relationship. When you have located both features, you will be presented a form with the following fields:

    When you have finished, click the button entitled ``Create Correspondence.'' If the correspondence is successfully created, you will be taken to the ``View Feature Correspondence'' page showing you the two features, their respective maps and types, and the evidences supporting this relationship. You can add more evidence types by clicking the link ``Add Evidence.'' You can also edit and delete existing evidences by clicking the appropriate links to the right of them. If you choose to ``Edit'' a correspondence, you will be presented a form with the following fields:

  2. Automatic name-based correspondences

    The second method for creating correspondences is to allow CMap to create a relationship between any two features with the exact same name, irrespective of case. To do this, start cmap_admin.pl and choose ``Make name-based correspondences.'' It is recommended that you define a special evidence type to represent this automated correspondence. This method is likely to be both incomplete and overly optimistic. For example, it will create correspondences between features called ``Centromere,'' which is probably not desirable. The name comparisons happen on both the ``feature_name'' in the ``cmap_feature'' table as well as any aliases from the ``cmap_feature_alias'' table. Every feature has each of these fields compared to those fields for every other feature, so this can take a long time to complete. For all of its problems, this can be a good way to get started, and you can pare down the correspondences from there.

    By default, CMap will only compare features of the same type when making name-based correspondences. You can expand the feature types considered by adding appropriate lines to ``cmap.conf.'' This is documented in that file; look for the string ``add_name_correspondence'' and follow the directions there.

  3. Import

    The third way to create correspondences is to import them. Simply create a tab-delimited file that lists the names (or accession IDs) of two features and some evidence to support the correspondence. For more information on the format of this file, execute the following on your system:

      perldoc Bio::GMOD::CMap::Admin::ImportCorrespondences

    This method is the surest as you are always directly controlling what gets created.
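To make the name-based method (the second of the three above) more concrete, here is a Python sketch of case-insensitive matching over names and aliases, restricted to features of the same type by default. It is an illustration under simplifying assumptions: it groups features by lowercased name instead of comparing every feature with every other, it does not look at which map a feature belongs to, and the record layout and function name are invented.

```python
from collections import defaultdict
from itertools import combinations

def name_based_pairs(features, same_type_only=True):
    """Pair up features that share a name or alias, ignoring case.

    features: list of dicts with keys "id", "name", "aliases" and "type".
    Returns a sorted list of (id_a, id_b) tuples with id_a < id_b.
    """
    by_name = defaultdict(set)
    for feature in features:
        for name in [feature["name"], *feature.get("aliases", [])]:
            by_name[name.lower()].add(feature["id"])
    types = {feature["id"]: feature["type"] for feature in features}
    pairs = set()
    for ids in by_name.values():
        for a, b in combinations(sorted(ids), 2):
            if same_type_only and types[a] != types[b]:
                continue
            pairs.add((a, b))
    return sorted(pairs)
```

The sketch also shows why the method is overly optimistic: any two features that happen to share a name, such as two centromeres, end up paired regardless of whether the relationship is biologically meaningful.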

LOADING THE CORRESPONDENCE MATRIX

The data underlying the correspondence matrix is all precomputed. As it is an intensive operation on seldom-changing data, it was determined to cache the pair-wise comparisons of all the maps in the database into a (very denormalized) table that would subsequently optimize the many calls for this data. Because of this, it is necessary that you remember to reload the matrix whenever you alter the number of correspondences in the database. To do this, execute cmap_admin.pl and choose the ``Reload correspondence matrix'' option. The only option is to completely truncate the table and reload it from scratch.
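Conceptually, reloading the matrix means recounting the correspondences between every pair of maps. The Python sketch below shows that idea with in-memory data. The real lookup table and its columns are not described in this document, so the data layout and the function name here are purely illustrative.

```python
from collections import Counter

def build_matrix(correspondences, feature_to_map):
    """Count correspondences between each unordered pair of maps.

    correspondences: iterable of (feature_id1, feature_id2) pairs.
    feature_to_map: dict mapping a feature ID to the map it lies on.
    """
    matrix = Counter()
    for f1, f2 in correspondences:
        pair = tuple(sorted((feature_to_map[f1], feature_to_map[f2])))
        matrix[pair] += 1
    return matrix
```

Because the counts are computed once and stored, they go stale as soon as correspondences are added or removed, which is why the reload step must be repeated after such changes.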

ESTABLISHING DATABASE CROSS-REFERENCES

See the ``attributes-and-xrefs.pod'' document.



VIEWING DATA

Now that we've stepped through the basics of setting up the CMap data, let's go over some of the more basic operations of curating the data.

VIEWING ALL MAP SETS

From the home page of the web admin tool, you'll see that you can ``View Map Sets.'' Selecting this link takes you to the ``Map Sets View'' page where you'll notice three drop-down boxes that you can use to narrow the selection criteria for the map sets being displayed. You can restrict them by species, map type, and whether or not they are currently enabled. There is no automatic submission of the form when you make a choice (as you might want to use more than one criterion), so be sure to hit ``Submit'' when you've made your choices. As noted before, you can re-sort the data by clicking on the hyperlinked column headers. Every object in the web admin tool can be ``viewed,'' ``edited,'' and ``deleted'' by the respective links which are usually displayed to the right.

Note: On pages that may return a large record set, the data is ``paged'' and can be accessed by moving through the pages of the data. On all these pages, the total number of records is displayed along with which are currently being shown. This is the section that looks like ``52 records found. Showing 1 to 25.'' (The page size is determined by the ``max_child_elements'' option in the CMap configuration file which will be discussed shortly.)
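The paging arithmetic behind that summary line is simple, as the following Python sketch shows. It assumes pages are numbered from 1 and that the total is at least 1; the function name is invented, and the page size stands in for the ``max_child_elements'' setting.

```python
def page_summary(total, page_no, page_size):
    """Return the 'records found' line for a given 1-based page number."""
    first = (page_no - 1) * page_size + 1
    last = min(page_no * page_size, total)
    return f"{total} records found. Showing {first} to {last}."

print(page_summary(52, 1, 25))
```

With 52 records and a page size of 25, the third and last page shows records 51 to 52.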

VIEWING A MAP SET

Choose to ``view'' one of your map sets. You'll be taken to a page which lists all the data in the map set table as well as a summary of all the maps associated with the map set. Notice that you can change any of the data for the map set by clicking the ``edit'' link at the top, or you can delete it from here by choosing the ``delete'' link.

Note: Deleting an object that has dependencies will cause the dependencies to be deleted as well. So deleting a map will delete all the features (which will delete all the correspondences which will delete all the evidences [but not evidence types] supporting the correspondence). Deleting a map set deletes all the maps (which deletes all the features which ... you get the picture). Basically, just be very sure that you want to delete something as it can have cascading effects, and, as noted earlier, THERE IS NO UNDO. You do, however, have the option to dump your data before modifying it, which gives you the ability to recover. This is discussed in the POD documentation of cmap_admin.pl.
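To get a feel for how far a delete can reach, the cascade can be modeled in Python with plain dictionaries. This is a toy model of the chain described in the note (map, then features, then correspondences, then evidences), not CMap's schema or code; the structure of ``db'' and the function name are assumptions of the example.

```python
def delete_map(db, map_id):
    """Remove a map's features and everything that depends on them.

    db has three dicts:
      "features":        feature_id -> map_id
      "correspondences": corr_id -> (feature_id1, feature_id2)
      "evidences":       evidence_id -> corr_id
    Mutates db and returns how many records of each kind were removed.
    """
    features = {f for f, m in db["features"].items() if m == map_id}
    corrs = {
        c for c, pair in db["correspondences"].items()
        if pair[0] in features or pair[1] in features
    }
    evidences = {e for e, c in db["evidences"].items() if c in corrs}
    for f in features:
        del db["features"][f]
    for c in corrs:
        del db["correspondences"][c]
    for e in evidences:
        del db["evidences"][e]
    return {
        "features": len(features),
        "correspondences": len(corrs),
        "evidences": len(evidences),
    }
```

Note that a correspondence disappears as soon as either of its two features is deleted, so removing one map also removes correspondence records that touch features on other maps.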

VIEWING A MAP

Choose to ``view'' a map to see a summary of all the data in the map table as well as all the features on the map. Notice that you can restrict the features displayed by their feature types. You can also search for a feature; the search option on the map view page automatically restricts the search to the current map being displayed.

EDITING A MAP

If you click on ``Edit'' while viewing a map, you will be taken to a page with the following fields:

VIEWING A FEATURE

Choose to ``view'' a feature from the map view page. You'll be presented with all the data stored in the feature table as well as the feature's aliases, correspondences to other features and the evidence types supporting the correspondences. You can also add a new correspondence by clicking ``Add Correspondence'' and following the directions discussed earlier in the CREATING CORRESPONDENCE RECORDS section.

EDITING A FEATURE

If you click on the ``Edit'' link from the feature view page, you'll be presented a form with the following fields:

SEARCHING FOR A FEATURE

Often you will be interested in finding an individual feature in the database without having to navigate to the map set, then the map, then page through until you find the feature. From the home page of the web admin tool, you can choose the ``Search for a Feature'' link. The ``Feature Search'' form gives you four fields:



CUSTOMIZING HTML TEMPLATES

All the HTML displayed by the application is contained in the templates. These templates are processed by Template Toolkit to produce the user interface. For the most part, the files contain straight-up HTML and can be altered to your heart's content. You could probably even pass off the care and feeding of these templates to a non-technical person (as this was the idea behind having no HTML in the code). The only functional parts of the templates lie in between the many ``[% %]'' tags, and these are often quite self-explanatory. If you can't figure out what to change on your own, then check out http://www.template-toolkit.com/ for the documentation (or type ``perldoc Template'' on your command line).


ADMINISTRATION USING cmap_admin.pl

The cmap_admin.pl script contains full documentation in POD format. To read it, execute ``perldoc <script_name>''. The POD also describes how to drive cmap_admin.pl with command-line flags for scripting.

The menu options that cmap_admin.pl provides are described here.

  1. Change current data source

    Changes the data source currently being used. This chooses which configuration file the script will use.

  2. Create new map set

    After a species has been created using the web admin tool, map sets can be created using cmap_admin.pl.

  3. Import data

    Allows import of data from files. The files can be tab-delimited map data for an existing map set, tab-delimited correspondence data for existing features or the experimental CMap objects (which are created with the export function). To see more information about the tab-delimited format required, see the documentation that is in the modules by running ``perldoc /path/to/Bio/GMOD/CMap/Admin/Import.pm'' and ``perldoc /path/to/Bio/GMOD/CMap/Admin/ImportCorrespondences.pm''.

  4. Export data

    Allows export of data to a file. The data can be represented in a tab-delimited text file, as a series of SQL INSERT statements or as the experimental CMap objects.

  5. Delete data

    Allows sweeping deletion of maps, map sets or correspondences.

  6. Make name-based correspondences

    Allows the creation of correspondences based on feature name similarities.

  7. Reload correspondence matrix

    Reloads the lookup table that the matrix uses. This is vital if the matrix is to have current data after a data change.

  8. Purge the cache to view new data

    Purges the query cache. The results of many queries are cached to reduce the time spent querying the database for common queries. Purging the cache is important after the data has changed or after the configuration file has changed. Otherwise, the changes will not be displayed consistently.

    There are five layers of the cache. When one layer is purged, all of the layers after it are purged.

    - Cache Level 1 Purge All

    Purge all when a map set or species has been added or modified. A change to map sets or species has potential to impact all of the data.

    - Cache Level 2 (purge map info on down)

    Level 2 is purged when map information is changed.

    - Cache Level 3 (purge feature info on down)

    Level 3 is purged when feature information is changed.

    - Cache Level 4 (purge correspondence info on down)

    Level 4 is purged when correspondence information is changed.

    - Cache Level 5 (purge whole image caching)

    Level 5 is purged when any information changes because this level caches the whole CMap image when a map is first displayed.

  9. Delete duplicate correspondences

    If duplicate correspondences may have been added, this will remove them.

  10. Import links

    This option imports links that will show up in the ``Saved Links'' section of CMap. The import takes an XML file.

    An example file is supplied, data/sample_saved_links.xml. You will need to change the accessions (*_acc) to reflect your database even if you used the test data (since the test data doesn't specify accessions).

    Here are the main elements of the XML file.

Note: If there are multiple CMap installations on the same machine, you may have to specify the --config_dir option to be sure that cmap_admin.pl is using the correct config directory.


REDUCING QUERY CACHE SIZE

cmap_reduce_cache_size.pl

cmap_reduce_cache_size.pl will help limit the growth of the query cache files.

This script cycles through each CMap data_source and (using the Cache::SizeAwareFileCache functionality) reduces the size of the query cache to the value given as 'max_query_cache_size' in the config file. It first removes any expired entries and then, if the cache is still over the limit, removes the least recently accessed entries. If 'max_query_cache_size' is not set, the script uses the default value in Bio::GMOD::CMap::Constants. It is suggested that this script be run periodically as a cron job.
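
The trimming order described above can be pictured with a short Python sketch. It is an illustration only, not the Cache::SizeAwareFileCache code: the ``Entry'' fields, the ``reduce_cache'' name and the in-memory list are all assumptions made for the example, and it assumes that the entries dropped in the second pass are the least recently accessed ones.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for one cached query result."""
    key: str
    size: int            # size in bytes
    expires_at: float    # epoch seconds
    last_access: float   # epoch seconds


def reduce_cache(entries, max_size, now):
    """Return the entries that survive trimming to max_size bytes."""
    # Pass 1: drop everything that has already expired.
    kept = [e for e in entries if e.expires_at > now]

    # Pass 2: if still over the limit, drop the least recently
    # accessed entries first until the total fits.
    kept.sort(key=lambda e: e.last_access)  # oldest access first
    total = sum(e.size for e in kept)
    while kept and total > max_size:
        total -= kept.pop(0).size
    return kept


sample = [
    Entry("a", 40, expires_at=50, last_access=10),   # already expired
    Entry("b", 40, expires_at=200, last_access=20),
    Entry("c", 40, expires_at=200, last_access=30),
    Entry("d", 40, expires_at=200, last_access=40),
]
survivors = reduce_cache(sample, max_size=80, now=100)
print([e.key for e in survivors])
```

With these sample values, ``a'' goes in the first pass because it has expired, and ``b'' goes in the second pass because it was accessed longest ago.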


VALIDATE CONFIG FILES

cmap_validate_config.pl

This script will test a config file to determine if it is valid or not. It currently only tests the individual data_source files and not the global.conf.

Run it on individual config files.

  $ cmap_validate_config.pl config_file.conf

The last line of the output is most important.

  The config file, config_file.conf is valid.
  or
  The config file, config_file.conf is INVALID.

You can read through the output to find out why it is invalid, if that is the case.

The script will also output options that are missing but not required to let you know what other options you can add. It will also tell you if you are using any deprecated options.
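
If you want to check config files from another script, the ``last line'' rule above is easy to automate. The following Python sketch is one way to do it, under some assumptions: the function names are made up for the example, the script is assumed to be on your PATH, and its name is a parameter (with a default taken from the heading above) so you can adjust it to match your install.

```python
import subprocess


def verdict_from_output(output):
    """Interpret the final non-blank line of the validator's output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no output to inspect")
    last = lines[-1]
    if last.endswith("is valid."):
        return True
    if last.endswith("is INVALID."):
        return False
    raise ValueError(f"unrecognized final line: {last!r}")


def validate_config(path, script="cmap_validate_config.pl"):
    """Run the validator on one config file and return True or False.

    stdout and stderr are merged because this sketch does not assume
    which stream the script writes its verdict to.
    """
    result = subprocess.run(
        [script, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return verdict_from_output(result.stdout)
```

Only ``verdict_from_output'' depends on the wording shown above; if the wording of the final line differs in your version, that is the one place to change.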


cmap_matrix_compare.pl

The cmap_matrix_compare.pl script contains full documentation in POD format. To read, please execute ``perldoc <script_name>''.


LINKING IN

Most likely, you'll want to link directly into the CMap viewer from some other part of your site.

To link to just one map, make it the ``reference'' map by using the accession IDs for the map itself and, optionally, the map's parent ``set''. Here's an example showing just one map on Gramene, showing all the feature labels and highlighting the feature ``RM9'':

  http://www.gramene.org/db/cmap/map_details?ref_map_set_aid=cu-dh-2001;ref_map_aids=cu-dh-2001-1;highlight="RM9";data_source=Gramene;label_features=all
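
If you generate such links from a script, a small helper keeps the pieces straight. This Python sketch is only an illustration: ``build_map_details_url'' is a made-up name, the argument names and the semicolon separator are copied from the example above, and the values are percent-encoded (so the quotes around RM9 come out as %22, the standard encoding of a double quote).

```python
from urllib.parse import quote


def build_map_details_url(base_url, args):
    """Join CMap URL arguments with semicolons, as in the example above.

    args is a list of (name, value) pairs so the order is preserved.
    """
    pairs = [f"{name}={quote(str(value), safe='')}" for name, value in args]
    return base_url + "?" + ";".join(pairs)


url = build_map_details_url(
    "http://www.gramene.org/db/cmap/map_details",
    [
        ("ref_map_set_aid", "cu-dh-2001"),
        ("ref_map_aids", "cu-dh-2001-1"),
        ("highlight", '"RM9"'),
        ("data_source", "Gramene"),
        ("label_features", "all"),
    ],
)
print(url)
```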

The following should help you create your own CMap URL:

URI

/cgi-bin/cmap/viewer?

URL Arguments

Required Arguments:

Operational Arguments

Arguments for Speed and Clarity

Menu Arguments

Hide/Display Menus

These arguments allow you to hide or show individual menus. Set to 1 to display and 0 to hide. They are hidden by default.

Just in case you see it

These will probably not be used in a constructed URL, but they are described here in case you run across them and wonder what they do. Also, it's good to document them anyway.

Map Details Arguments

Recently, the Map Details page has been folded into the regular viewer.

The URI of the Map Details page is ``/cgi-bin/cmap/map_details?''

Deprecated Arguments

If you are using these in a URL, please change the URL. They are not guaranteed to work.

Map Accession Information Format

Drawing information about maps (cropping and magnification) can be inserted with the map accession in the following fields.

The format is ``accession[start*stopxmagnification]''. Any position can be left out but the ``*'' is required. The ``x'' can be left out if there is no magnification.

Examples:

cu-sl-1994-2[200*400x1]

Correct, crops from 200 to 400 with normal magnification

cu-sl-1994-2

Correct, no cropping with normal magnification

cu-sl-1994-2[*400]

Correct, crops from the start to 400 with normal magnification

cu-sl-1994-2[*x3]

Correct, no cropping with 3 times magnification

cu-sl-1994-2[200]

Incorrect, need the ``*'' otherwise we can't tell if you want this to be the start or the end.

cu-sl-1994-2[x3]

Incorrect, need the ``*'' because we say so.
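
The format rules and examples above can be captured in a short Python sketch. This is not CMap's own parser: the ``parse_map_accession'' name, the returned dictionary and the accepted number syntax (optional minus sign and decimal part) are assumptions for illustration.

```python
import re

# accession[start*stopxmagnification]
# The bracketed part is optional, but when it is present the "*" is required.
_NUMBER = r"-?\d+(?:\.\d+)?"
_MAP_ACC = re.compile(
    r"^(?P<acc>[^\[\]]+)"
    r"(?:\["
    rf"(?P<start>{_NUMBER})?\*(?P<stop>{_NUMBER})?"
    r"(?:x(?P<mag>\d+(?:\.\d+)?))?"
    r"\])?$"
)


def parse_map_accession(text):
    """Split a map accession field into accession, crop and magnification."""
    match = _MAP_ACC.match(text)
    if match is None:
        raise ValueError(f"not a valid map accession field: {text!r}")
    start, stop, mag = match.group("start", "stop", "mag")
    return {
        "accession": match.group("acc"),
        "start": float(start) if start is not None else None,
        "stop": float(stop) if stop is not None else None,
        # No "x" part means normal (1x) magnification.
        "magnification": float(mag) if mag is not None else 1.0,
    }


print(parse_map_accession("cu-sl-1994-2[*400]"))
```

The four ``Correct'' examples parse, and the two ``Incorrect'' ones raise ValueError because the ``*'' is missing.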

Comparative Map Construction

To include some number of comparative maps, you provide them in the ``comparative_maps'' argument, which is a single structured string that lists all the comparative maps and their placement relative to the reference map. Here I'd like to introduce the concept of ``slots,'' where the maps (or map sets) fall into a slot moving in positive and negative direction away from the reference map, which is in slot ``0,'' like so:

           -      -        -         -
           |      |        |         |
           |      |      - |  -      |
           |      |      | -  |      |
           |      |      |    |      |
           |      |      |    -      |
           -      -      -           -
          -1      0         1        2
  <----negative --+---------positive----->

The above drawing is representative of Gramene's genetic maps in slots -1, 0, and 2, and a physical map in slot 1. Slots are separated in the string by URI-escaped colons and the integral parts of the slot by URI-escaped equal signs, like so:

    "comparative_maps="
    +
        .-  <slot_number> + 
        |  "%3D" + 
  slot  |  <"map_acc" or "map_set_acc"> + 
        |  "%3D" +
        `-  <accession ID of the map or map set>
    +
    "%3A"
    +
   <next slot>

Here is a sample that puts ``Rice-Cornell RFLP 2001-1'' on the left (slot ``-1'') and ``Rice-CTIR 2000-1'' on the right (slot ``1''):

    comparative_maps=1%3Dmap_acc%3D423%3A-1%3Dmap_acc%3D1

The middle part of the slot is one of the literal strings ``map_set_acc'' or ``map_acc.'' If you wanted to display just a single map in a slot, use the string ``map_acc'' and the map's accession ID. If you want a whole map set in a slot (e.g., a physical map set like ``I-Map''), then use the string ``map_set_acc'' and the map set's accession ID. You can use the admin interface (``/cgi-bin/cmap/admin'') to easily find the accession IDs.
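
Escaping the string by hand is error-prone, so here is a minimal Python sketch that assembles the ``comparative_maps'' argument from a list of slots. The ``build_comparative_maps'' name and the tuple layout are made up for the example; the escaping itself follows the description above.

```python
from urllib.parse import quote


def build_comparative_maps(slots):
    """Build the comparative_maps argument.

    slots is a list of (slot_number, kind, accession) tuples, where kind
    is the literal string "map_acc" or "map_set_acc".
    """
    parts = []
    for slot_number, kind, accession in slots:
        if kind not in ("map_acc", "map_set_acc"):
            raise ValueError(f"unknown slot kind: {kind!r}")
        parts.append(f"{slot_number}={kind}={accession}")
    # quote() turns "=" into %3D and ":" into %3A.
    return "comparative_maps=" + quote(":".join(parts), safe="")


print(build_comparative_maps([(1, "map_acc", "423"), (-1, "map_acc", "1")]))
```

This prints the same string as the sample above, ``comparative_maps=1%3Dmap_acc%3D423%3A-1%3Dmap_acc%3D1''.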


USING CMAP MODULES IN CUSTOM SCRIPTS

There are several CMap modules that can be used by scripts outside of the CGI. To view more documentation on module use, execute ``perldoc Module_Name.pm''.

Modules that are particularly useful:


SAMPLE DATA

The data/ directory has some example data. The sample-dump.sql.gz file contains a full set of CMap data. The tabtest* files also give some simple example files for importing.

If you would like to see other sample import data for CMap, you can find many examples from the Gramene site:

  ftp://ftp.gramene.org/pub/gramene/CURRENT_RELEASE


MULTIPLE CMAP INSTALLATIONS

There may be occasion to run multiple CMap installations on the same server. To do so, use the customizable install options when running ``perl Build.PL''. The following is an example that creates a secondary install in the /usr/local/apache2/htdocs/cmap2/ directory.

  perl Build.PL \
    PREFIX=/usr/local/apache2/ \
    CONF=/usr/local/apache2/conf/cmap.conf2 \
    WEB_DOCUMENT_ROOT=/usr/local/apache2/htdocs/ \
    CGIBIN=/usr/local/apache2/htdocs/cmap2/cgi-bin \
    HTDOCS=/usr/local/apache2/htdocs/cmap2/

Here is the breakdown of the options:

The options not used in the example are ``TEMPLATE'', ``CACHE'' and ``SESSIONS''. These can also be set if so desired.

To access the CMap installation from the above example, use the address ``http://127.0.0.1/cmap2/''. The cmap script will be at ``http://127.0.0.1/cmap2/cgi-bin/cmap''.

IMPORTANT NOTE: When creating the individual config files, the ``name'' field in ``<database>'' should be unique across the whole server. Using the same ``name'' for multiple config files, even if they are in different installs, can result in clashes over shared cache space.

OTHER NOTES:


MISCELLANEOUS

In the ``docs'' directory, you will find two images that might help you understand the relationships in the CMap tables, ``cmap-schema-graph.png'' and ``cmap-schema.png,'' both of which attempt to describe the tables and fields and their relationships. See also ``cmap-schema-desc.html'' for an HTML document describing the schema. These documents were automatically generated from schema definitions via scripts included with SQL::Translator, a set of modules which grew out of the author's constant need to make schema changes and quickly replicate them amongst the different test databases. The Oracle, PostgreSQL, SQLite and Sybase schemas were also generated by SQL::Translator from the MySQL schema, so if you see room for improvement, please relay your suggestions to the author, preferably via the SQL::Translator mailing list. SQL::Translator is available on CPAN. For more information, see here:

  http://sqlfairy.sourceforge.net/


TROUBLESHOOTING

Here are the solutions to a few common problems:


AUTHOR

Ken Y. Clark, kclark@cshl.edu

Ben Faga, faga@cshl.edu

Copyright (c) 2002-6 Cold Spring Harbor Laboratory
