PCO ver. 2.0
Last update: 20 May 2005

-------------------------------------------
I'm sorry, but PCO ver.1.0 has a mistake in its calculation process. If you have downloaded and installed PCO ver.1.0 on your computer, please replace it with the current version of PCO (i.e., PCO ver.2.0).
-------------------------------------------

PCO ver.2.0: MS-DOS program for principal coordinate analysis

 

Description

A small MS-DOS program for conducting the principal coordinate analysis proposed by Gower (1966).

This program is written in C, based on Tanaka and Tarumi (1995).

 

Download

pco.zip (47kb)

This file contains the following files:

  • pco.exe ... Executable file of PCO
  • pco.c, matrix.c ... source files (C language)
  • infile.dat ... sample input file (tab delimited text format)
  • outfile.csv ... sample output file (CSV format)

 

Installation

Extract all files from "pco.zip" and copy them into any folder you like.

Example folder name:

c:\Apps\pco

 

Execution

  1. Open an MS-DOS command prompt window.
  2. Type
    cd [the folder (directory) in which you installed the PCO files]
  3. Type
    pco [input file name] [output file name]
    (for example, with the bundled sample files: pco infile.dat outfile.csv)
  4. That's all! Please enjoy PCO analysis!

[Screenshot: the MS-DOS command prompt window]

 

Input file for PCO

The input file is a tab-delimited text file with the data structure described below.
You can enter the data in MS-Excel and save it as a tab-delimited text file.
-----------------------------

[the number of samples]
[similarity (s) between the 1st and 1st samples] [s between the 1st and 2nd samples] ... [s between the 1st and n-th samples]
[s between the 2nd and 1st samples] [s between the 2nd and 2nd samples] ... [s between the 2nd and n-th samples]
...
[s between the n-th and 1st samples] [s between the n-th and 2nd samples] ... [s between the n-th and n-th samples]

----------------------------
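As an illustration of the layout above, here is a minimal C sketch of how such a file could be read. It is not taken from pco.c; the function name read_similarity is hypothetical, and the sketch assumes that values are separated by tabs, spaces, or line breaks.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads the layout described above: the number of samples n first,
 * followed by n rows of n similarity values. fscanf skips any
 * whitespace, so tabs, spaces and line breaks all work as separators.
 * Returns a malloc'ed row-major n*n array (the caller frees it),
 * or NULL if the file cannot be opened or is malformed. */
double *read_similarity(const char *path, int *n_out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;

    int n = 0;
    if (fscanf(fp, "%d", &n) != 1 || n <= 0) {
        fclose(fp);
        return NULL;
    }

    size_t count = (size_t)n * (size_t)n;
    double *e = malloc(count * sizeof *e);
    if (e == NULL) {
        fclose(fp);
        return NULL;
    }

    for (size_t k = 0; k < count; k++) {
        if (fscanf(fp, "%lf", &e[k]) != 1) {  /* too few or malformed values */
            free(e);
            fclose(fp);
            return NULL;
        }
    }

    fclose(fp);
    *n_out = n;
    return e;
}
```

The element for the i-th row and j-th column (counting from 0) is then e[i * n + j].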
You can obtain a similarity matrix from your distance matrix in several ways. For example, you can calculate the similarity between the i-th and j-th samples (the (i, j)-th element of the similarity matrix) as follows.

    e_ij = -d_ij^2 / 2

    e_ii = 0

where d_ij^2 is the squared distance between the i-th and j-th samples (the (i, j)-th element of the squared distance matrix) (Tanaka and Tarumi 1995, p. 188).
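The conversion can be sketched in C as follows. This is only an illustration, not code from the PCO sources; the function name distance_to_similarity is hypothetical, and the sketch assumes that the input holds plain (not squared) distances in a row-major n x n array.

```c
#include <stddef.h>

/* Applies e_ij = -d_ij^2 / 2 to every off-diagonal element and sets
 * e_ii = 0. d holds plain distances; d and e are row-major n*n arrays. */
void distance_to_similarity(const double *d, double *e, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            size_t k = (size_t)i * (size_t)n + (size_t)j;
            e[k] = (i == j) ? 0.0 : -(d[k] * d[k]) / 2.0;
        }
    }
}
```

For example, a distance of 2 between two samples gives a similarity of -2, and a distance of 4 gives -8.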

If you can obtain a similarity matrix directly, please use it as is in the PCO analysis (for example, Nei's genetic similarity matrix).

-------Caution-------
Please input a "Similarity" (not "Distance") matrix!
--------------------



[Screenshot: data entry in Excel]

 

Output file of PCO

The output file is formatted as a CSV (comma-separated values) file.
Its contents are as follows:

---------------------------------------
[SIMILARITY MATRIX]
,0.000000,-1.000000,-5.000000,-17.000000,-20.000000,-25.000000,-13.000000,-9.000000
,-1.000000,0.000000,-4.000000,-16.000000,-25.000000,-32.000000,-20.000000,-16.000000
,-5.000000,-4.000000,0.000000,-4.000000,-13.000000,-20.000000,-16.000000,-20.000000
,-17.000000,-16.000000,-4.000000,0.000000,-9.000000,-16.000000,-20.000000,-32.000000
,-20.000000,-25.000000,-13.000000,-9.000000,0.000000,-1.000000,-5.000000,-17.000000
,-25.000000,-32.000000,-20.000000,-16.000000,-1.000000,0.000000,-4.000000,-16.000000
,-13.000000,-20.000000,-16.000000,-20.000000,-5.000000,-4.000000,0.000000,-4.000000
,-9.000000,-16.000000,-20.000000,-32.000000,-17.000000,-16.000000,-4.000000,0.000000


[DOUBLE CENTERING MATRIX]
,10.000000,12.000000,4.000000,-4.000000,-10.000000,-12.000000,-4.000000,4.000000
,12.000000,16.000000,8.000000,0.000000,-12.000000,-16.000000,-8.000000,0.000000
,4.000000,8.000000,8.000000,8.000000,-4.000000,-8.000000,-8.000000,-8.000000
,-4.000000,0.000000,8.000000,16.000000,4.000000,0.000000,-8.000000,-16.000000
,-10.000000,-12.000000,-4.000000,4.000000,10.000000,12.000000,4.000000,-4.000000
,-12.000000,-16.000000,-8.000000,0.000000,12.000000,16.000000,8.000000,0.000000
,-4.000000,-8.000000,-8.000000,-8.000000,4.000000,8.000000,8.000000,8.000000
,4.000000,0.000000,-8.000000,-16.000000,-4.000000,0.000000,8.000000,16.000000


[EIGEN VALUE],PCO1,PCO2,PCO3,PCO4,PCO5,PCO6,PCO7,PCO8
,58.246211,41.753789,0.000000,0.000000,0.000000,-0.000000,-0.000000,-0.000000
[CONTRIBUTION],PCO1,PCO2,PCO3,PCO4,PCO5,PCO6,PCO7,PCO8
,0.582462,0.417538,0.000000,0.000000,0.000000,-0.000000,-0.000000,-0.000000
[CUMULATIVE CONTRIBUTION],PCO1,PCO2,PCO3,PCO4,PCO5,PCO6,PCO7,PCO8
,0.582462,1.000000,1.000000,1.000000,1.000000,1.000000,1.000000,1.000000


[EIGEN VECTOR],PCO1,PCO2,PCO3,PCO4,PCO5,PCO6,PCO7,PCO8
,-0.374131,0.210324,-0.000000,0.000000,0.000000,0.000000,0.000000,0.903211
,-0.520188,0.075635,-0.023405,0.509192,0.000000,-0.057395,-0.637367,-0.233087
,-0.292113,-0.269379,-0.830017,-0.331105,0.000000,0.198909,-0.024201,-0.058272
,-0.064038,-0.614392,0.268143,-0.062099,0.707107,0.107478,-0.132401,0.116543
,0.374131,-0.210324,-0.320687,0.753554,0.000000,0.263075,0.185208,0.203951
,0.520188,-0.075635,-0.239004,-0.120162,-0.000000,-0.535427,-0.557773,0.233087
,0.292113,0.269379,0.082212,-0.202726,0.000000,0.760360,-0.461200,0.058272
,0.064038,0.614392,-0.268143,0.062099,0.707107,-0.107478,0.132401,-0.116543


[PCO SCORE],PCO1,PCO2,PCO3,PCO4,PCO5,PCO6,PCO7,PCO8
,-2.855339,1.359057,-0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
,-3.970030,0.488733,-0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
,-2.229382,-1.740649,-0.000000,-0.000000,0.000000,0.000000,0.000000,0.000000
,-0.488733,-3.970030,0.000000,-0.000000,0.000000,0.000000,0.000000,0.000000
,2.855339,-1.359057,-0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
,3.970030,-0.488733,-0.000000,-0.000000,-0.000000,0.000000,0.000000,0.000000
,2.229382,1.740649,0.000000,-0.000000,0.000000,0.000000,0.000000,0.000000
,0.488733,3.970030,-0.000000,0.000000,0.000000,0.000000,0.000000,0.000000

------------------------------------------

[SIMILARITY MATRIX] is the similarity matrix supplied by the user.
[DOUBLE CENTERING MATRIX] is the double-centered matrix A. The (i, j)-th element of A is calculated as

a_ij = e_ij - e_i. - e_.j + e_..

where e_ij is the (i, j)-th element of the similarity matrix, and e_i., e_.j, and e_.. are the averages of the elements of the i-th row, the j-th column, and the whole similarity matrix, respectively.
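The double-centering step can be sketched in C as below. This is a minimal illustration of the formula above rather than the code in pco.c or matrix.c; the function name double_center and the row-major array layout are assumptions.

```c
#include <stdlib.h>

/* Computes a_ij = e_ij - (mean of row i) - (mean of column j)
 * + (overall mean) for a row-major n*n similarity matrix e.
 * Returns 0 on success, -1 if memory allocation fails. */
int double_center(const double *e, double *a, int n)
{
    double *row = calloc((size_t)n, sizeof *row);
    double *col = calloc((size_t)n, sizeof *col);
    if (row == NULL || col == NULL) {
        free(row);
        free(col);
        return -1;
    }

    /* Accumulate row sums, column sums and the grand total. */
    double all = 0.0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double v = e[(size_t)i * (size_t)n + (size_t)j];
            row[i] += v;
            col[j] += v;
            all += v;
        }
    }

    /* Turn the sums into averages. */
    for (int i = 0; i < n; i++) {
        row[i] /= n;
        col[i] /= n;
    }
    all /= (double)n * (double)n;

    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            size_t k = (size_t)i * (size_t)n + (size_t)j;
            a[k] = e[k] - row[i] - col[j] + all;
        }
    }

    free(row);
    free(col);
    return 0;
}
```

With the sample similarity matrix shown above, the first row average is -11.25 and the overall average is -12.5, so a_11 = 0 + 11.25 + 11.25 - 12.5 = 10, which agrees with the first element of the sample [DOUBLE CENTERING MATRIX].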

[EIGEN VALUE] are the eigenvalues of matrix A.

[CONTRIBUTION] and [CUMULATIVE CONTRIBUTION] are the contributions and cumulative contributions of the eigenvectors of matrix A, respectively.
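The sample output is consistent with each contribution being the eigenvalue divided by the sum of all eigenvalues (58.246211 / 100 = 0.582462). The C sketch below works under that assumption; the function name contributions is hypothetical, and PCO itself may treat negative eigenvalues differently.

```c
/* contrib[k] = eig[k] / (sum of all eigenvalues);
 * cumul[k]   = contrib[0] + ... + contrib[k].
 * Returns 0 on success, -1 if the eigenvalues sum to zero. */
int contributions(const double *eig, double *contrib, double *cumul, int n)
{
    double total = 0.0;
    for (int k = 0; k < n; k++)
        total += eig[k];
    if (total == 0.0)
        return -1;

    double running = 0.0;
    for (int k = 0; k < n; k++) {
        contrib[k] = eig[k] / total;
        running += contrib[k];
        cumul[k] = running;
    }
    return 0;
}
```

For the sample eigenvalues, the first two contributions add up to 1.0, so the cumulative contribution is already 1.0 at PCO2.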

[EIGEN VECTOR] are the eigenvectors of matrix A.
[PCO SCORE] are the principal coordinate scores obtained from your similarity matrix.
You can use this matrix to plot each sample on a principal coordinate plane. The (i, j)-th element of this matrix is the j-th coordinate of the i-th sample. For example, the 1st, 2nd, and 3rd samples are located at (-2.86, 1.36), (-3.97, 0.49), and (-2.23, -1.74), respectively, on the plane of the 1st and 2nd principal coordinates. The cumulative contribution reaches 1.0 at the 2nd coordinate, indicating that all the information contained in the similarity matrix is explained by the 1st and 2nd principal coordinates.
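One way to check this interpretation is to recompute the squared distances from the scores. If the similarity matrix was built with e_ij = -d_ij^2 / 2, then d_ij^2 = -2 e_ij, so the sample values e_12 = -1 and e_13 = -5 correspond to squared distances of 2 and 10. The C sketch below is only an illustration; the function name squared_distance is hypothetical.

```c
/* Squared Euclidean distance between two samples, using their first
 * k principal coordinate scores (x and y each hold k values). */
double squared_distance(const double *x, const double *y, int k)
{
    double sum = 0.0;
    for (int c = 0; c < k; c++) {
        double diff = x[c] - y[c];
        sum += diff * diff;
    }
    return sum;
}
```

Using only the first two scores of the sample output, the squared distance between the 1st and 2nd samples comes out at about 2, and between the 1st and 3rd samples at about 10, matching the values implied by the similarity matrix.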

References

Gower, J.C. (1966) Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53: 325-338.

Tanaka and Tarumi (1995) Handbook of statistical analysis for Windows (in Japanese). Kyoritu-shuppan, Tokyo.