data_structures
Class VladArray

java.lang.Object
  extended by data_structures.AbstractSearchStructure
      extended by data_structures.VladArray

public class VladArray
extends AbstractSearchStructure

This class can be used for storing vectors and also for performing k-nearest neighbor queries.

Author:
Eleftherios Spyromitros-Xioufis

Field Summary
private  gnu.trove.list.array.TIntArrayList idsList
          The ids of the corresponding images are stored in this int array.
private  boolean loadIndexInMemory
          Whether the index will be loaded in memory or not.
private  long totalpersistentIndexUpdateTime
          The total time taken to update the persistent index.
private  boolean useDiskOrderedCursor
          Whether to use a disk ordered cursor or not.
private  gnu.trove.list.array.TDoubleArrayList vectorsList
          The vectors are stored here.
private  com.sleepycat.je.Database vladArrayBDB
          Berkeley db for persistence of the VladArray data structure
 
Fields inherited from class data_structures.AbstractSearchStructure
cachePercent, cacheSize, countSizeOnLoad, dbEnv, idToNameBDB, idToNameMappings, loadCounter, maxNumVectors, nameToIdBDB, numIndexingThreads, readOnly, syncRate, transactional, vectorLength
 
Constructor Summary
VladArray(int vectorLength, int loadCounter, int maxNumVectors, java.lang.String BDBEnvHome, boolean loadIndexInMemory, boolean countSizeOnLoad, boolean readOnly)
          Advanced constructor.
VladArray(int vectorLength, int maxNumVectors, java.lang.String BDBEnvHome)
          Simple constructor.
 
Method Summary
 void closeInternal()
          Each subclass should implement this method to close the BDB databases that it uses.
protected  com.aliasi.util.BoundedPriorityQueue<Result> computeNearestNeighborsInternal(int k, double[] queryVector)
          Computes the k-nearest neighbors for the given query vector.
 double[] getVector(int k)
          Returns the k-th vector, either from the ram-based or from the disk-based index.
 int getVectorId(int k)
          Returns the k-th id in the ids array
protected  void indexVectorInternal(int id, double[] vector, com.sleepycat.je.Transaction txn)
          Append the vectors array with the given vector and the ids array with the given id.
private  void loadIndexInMemory(boolean diskOrdered)
          Loads the persistent index in memory.
protected  void outputIndexingTimesInternal()
          Index specific time measurements.
private  void updatePersistentIndex(int id, double[] vector, com.sleepycat.je.Transaction txn)
          Updates the persisent vlad index.
 
Methods inherited from class data_structures.AbstractSearchStructure
close, computeNearestNeighbors, createOrOpenBDBEnvAndDbs, getExternalId, getInternalId, getLoadCounter, indexVector, isIndexed, outputIndexingTimes
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

vectorsList

private gnu.trove.list.array.TDoubleArrayList vectorsList
The vectors are stored here. Note that we use a single double array for all vectors.


idsList

private gnu.trove.list.array.TIntArrayList idsList
The ids of the corresponding images are stored in this int array. This field is probably not needed since the position of a vector in the VladArray is the same as its id.


loadIndexInMemory

private boolean loadIndexInMemory
Whether the index will be loaded in memory or not.


useDiskOrderedCursor

private boolean useDiskOrderedCursor
Whether to use a disk ordered cursor or not. This setting changes how fast the index will be loaded in main memory.


vladArrayBDB

private com.sleepycat.je.Database vladArrayBDB
Berkeley db for persistence of the VladArray data structure


totalpersistentIndexUpdateTime

private long totalpersistentIndexUpdateTime
The total time taken to update the persistent index.

Constructor Detail

VladArray

public VladArray(int vectorLength,
                 int loadCounter,
                 int maxNumVectors,
                 java.lang.String BDBEnvHome,
                 boolean loadIndexInMemory,
                 boolean countSizeOnLoad,
                 boolean readOnly)
          throws java.lang.Exception
Advanced constructor.

Parameters:
vectorLength - The dimensionality of the vectors.
loadCounter - The initial value of the loadCounter.
maxNumVectors - The maximum number of vectors.
BDBEnvHome - The BDB environment home directory.
loadIndexInMemory - Whether to load the index in memory.
countSizeOnLoad - Whether the loadCounter will be initialized by the size of the persistent store.
readOnly - If true the persistent store will opened only for read access (allows multiple opens).
Throws:
java.lang.Exception

VladArray

public VladArray(int vectorLength,
                 int maxNumVectors,
                 java.lang.String BDBEnvHome)
          throws java.lang.Exception
Simple constructor.

Parameters:
vectorLength -
maxNumVectors -
BDBEnvHome -
Throws:
java.lang.Exception
Method Detail

indexVectorInternal

protected void indexVectorInternal(int id,
                                   double[] vector,
                                   com.sleepycat.je.Transaction txn)
                            throws java.lang.Exception
Append the vectors array with the given vector and the ids array with the given id. The id is assigned by the indexVector method and is equal to the current value of the load counter minus 1.

Specified by:
indexVectorInternal in class AbstractSearchStructure
Parameters:
name -
vector -
Throws:
java.lang.Exception

computeNearestNeighborsInternal

protected com.aliasi.util.BoundedPriorityQueue<Result> computeNearestNeighborsInternal(int k,
                                                                                       double[] queryVector)
                                                                                throws java.lang.Exception
Computes the k-nearest neighbors for the given query vector. The search is exhaustive but includes some optimizations which avoid many calculations.

Specified by:
computeNearestNeighborsInternal in class AbstractSearchStructure
Parameters:
k - the number of nearest neighbors to be returned
queryVector - the query vector
Returns:
the ids of the k nearest vectors
Throws:
java.lang.Exception

loadIndexInMemory

private void loadIndexInMemory(boolean diskOrdered)
                        throws java.lang.Exception
Loads the persistent index in memory.

Parameters:
diskOrdered - whether to use a diskOrderedCursor
Throws:
java.lang.Exception

updatePersistentIndex

private void updatePersistentIndex(int id,
                                   double[] vector,
                                   com.sleepycat.je.Transaction txn)
Updates the persisent vlad index.

Parameters:
id -
vector -
txn -

getVector

public double[] getVector(int k)
                   throws java.lang.Exception
Returns the k-th vector, either from the ram-based or from the disk-based index.

Parameters:
k - index of the vector to be returned (starts from 0)
Returns:
Throws:
java.lang.Exception

getVectorId

public int getVectorId(int k)
                throws java.lang.Exception
Returns the k-th id in the ids array

Parameters:
k - index of the id to be returned (starts from 0)
Returns:
Throws:
java.lang.Exception

closeInternal

public void closeInternal()
Description copied from class: AbstractSearchStructure
Each subclass should implement this method to close the BDB databases that it uses.

Specified by:
closeInternal in class AbstractSearchStructure

outputIndexingTimesInternal

protected void outputIndexingTimesInternal()
Description copied from class: AbstractSearchStructure
Index specific time measurements.

Specified by:
outputIndexingTimesInternal in class AbstractSearchStructure