TIBCO Spotfire Miner™ 8.2 Java/C++ Extensibility
November 2010
TIBCO Software Inc.
10• x-position (if in a worksheet)
• y-position (if in a worksheet)
• Property values set in the node property dialog
11For example, the XML for the Correlations node in the Explorer pane is:<ActivityNode engineClass= "com.insightful.miner.CorrelationsEn
12If your new components are implemented entirely using S-PLUS script nodes, and you have no special property dialogs or viewers then the jar file is
13extension.xml Files
If the file extension.xml exists within an extension subdirectory, it must be an XML file that describes exactly which files comp
14• libraryFile: Value is one or more C++ library files that are loaded when the extension is processed.
• imageDirectory: Value is one or more directori
15DEVELOPMENT TOOLS
You will need Java and possibly C++ compilers to create new nodes.
Java Compiler
To compile the Java code, you will need a Java comp
16Optional Tools While the compilers are the only tools strictly required for developing new nodes, we use some additional tools as a standard part of
17ARCHITECTURE FEATURES
The architecture for Spotfire Miner has a number of key features that are important for understanding how the application works
18Date
Date columns are used to represent dates and times. They are stored as a long representing the number of milliseconds since an origin of Januar
19Created
A created node has been created and possibly linked to other nodes, but does not have all of its required property values specified. The use
2IMPORTANT INFORMATIONSOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE T
20INFRASTRUCTURE JAVA CLASSES
Some important classes for storing and exchanging information are XTProps for storing property information and XTMetaData
21GRAPHICAL USER INTERFACE CLASSES
Overview Spotfire Miner is structured as a client/server application. Each component such as a Read File node has a
22Constructors The constructor method is typically empty. When the constructor is called, the node properties have not been initialized. Any subclas
23Column Filter Generator
Some nodes are able to create a Filter Columns node where the columns selected are determined based upon the node’s computati
24Node Dialog The user has great latitude in how the node’s properties dialog is implemented. In the application each node has its own class extendin
25ENGINE CLASSES
Overview For each type of node, there is a class extending EngineNode that is responsible for providing metadata to the activity node
26Output Meta Data
The calculateOutputMetaData() method is called to determine the names and types of the output columns. The pipeline takes care of c
27An example execute() method is in the section Simple Java Version on page 28.
C++ Procs If the computation is to be performed in C++, the CNKProc cla
28EXAMPLE: COPYING INPUTS
This section presents a simple example of the steps needed to create a new node. We will create a node that simply copies d
29/** * Very simple implementation of engine node copying each * input to the corresponding output. Assumes the same * number of inputs as outputs.
3identification purposes only. This software may be available on multiple operating systems. However, not all operating system platforms for a specifi
30 * is executed. */ public void execute(CNKProcJavaTransform proc) { for (int i=0; i<getNumInputs(); i++) { proc.copyData(i, 0, 0, i,
31Compilation
Now that we have our Java code, we need to compile it and place it in a jar file. The specific steps for compiling the code and creating
32<DisplayInfo labelText="First Copy Columns"defaultLabelText="First Copy Columns"smallIcon="default_small.gif"largeI
33Engine Node The engine node corresponding to a C++ proc differs from the version for the straight Java implementation in a variety of ways:• The cla
34 /** * Passes the input column name/type information as the * output information. */ public XTMetaData calculateOutputMetaData(int outputNum
35 * DLL */ public void createPeerObject() { createCNKObject("cnkcopy", new String[] { "CNKProcCppSecondCopy", "
36#include "CNKProcCppSecondCopy.h"CNK_DEFINE_ACCESSIBLE_CLASS(CNKProcCppSecondCopy)CNKProcCppSecondCopy::CNKProcCppSecondCopy() : CNKPro
37Now that we have the C++ source and header files, we need to compile the code to create the DLL.
Compilation
The programming examples directory contai
38 numOutputs="2" > <DisplayInfo labelText="Second Copy Columns" defaultLabelText="Second Copy Columns"
39public class ThirdCopyNodeModel extends ActivityNodeModel { /** * Boilerplate constructor. */ public ThirdCopyNodeModel() { } /** * Show t
4Important Information 2
Overview 6
Extension Files 7
Default Extension File Names 8
Explorer IML Files 9
Java Files 11
C++ Library Files 12
Image Files 12
He
40 } catch (Exception e) { e.printStackTrace(); valid = false; } return valid; } /** * Display the cached input metadata as HTM
41The column selection control is a list box of column names with selection indicating which Columns to Copy. Use the standard Windows selection mech
42 private ThirdCopyDialog() { super(); pack(); setMinimumSize(new Dimension(500,500)); } /** * Restore the list of column names and sel
43 } } /** * Method called by the dialog to save properties in Model */ public void saveProperties() throws NodeDialog.DialogExcept
44 listModel = new DefaultListModel(); listBox = new JList(); listBox.setModel(listModel); listBox.setSelectionMode( ListSe
45 JOptionPane.INFORMATION_MESSAGE); }}Engine Node The engine node implementation displays a variety of functionality:• Return the metadata fo
46 /** * Empty constructor just uses the super method. */ public ThirdCopyEngineNode() { } /** * Boilerplate for specifying this class provi
47 /** * Store the column names to be referred to when * executing. This shows how to store information for * use in multiple chunks. */ pu
48 // Some error checking if (inColNum < 0) { proc.addError("Column '" + colName + "' n
49 m_columnNames = null; }}XML Description The XML description for this node differs from the previous versions in the engine class name, GUI clas
5Special Interfaces 22
Node Dialog 24
Viewers 24
Engine Classes 25
Overview 25
General Methods 25
Constructors 25
Initialization 25
Output Meta Data 26
CNKProc
50THE TIBCO SPOTFIRE PIPELINE
Overview The TIBCO Spotfire Pipeline is a C++ system for accessing and manipulating very large data sets. The core of the
51Pipeline
A Pipeline object contains a set of bufs and procs. When a pipeline is executed, it repeatedly executes the procs, which read data from and
52Factor
The factor data type is much more complicated. A factor column maintains a list of strings, representing the levels of the factor. When a str
53String
String columns are used for informational columns such as names or addresses that do not represent categories and are not used in computations
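As a rough illustration of a column that maintains a list of level strings, the following Java sketch assigns each distinct string an integer level number in the order it is first seen. The class `FactorLevels`, its method names, and the zero-based numbering are assumptions made for illustration; they are not part of the Spotfire Miner API, and the real factor column follows rules not shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not a Spotfire Miner class. */
class FactorLevels {
    // Level strings, indexed by level number (zero-based by assumption).
    private final List<String> levels = new ArrayList<>();
    // Reverse lookup from level string to level number.
    private final Map<String, Integer> index = new HashMap<>();

    /** Returns the level number for s, registering a new level on first sight. */
    int levelFor(String s) {
        Integer existing = index.get(s);
        if (existing != null) {
            return existing;
        }
        int n = levels.size();
        levels.add(s);
        index.put(s, n);
        return n;
    }

    /** Returns the string for a previously assigned level number. */
    String levelName(int n) {
        return levels.get(n);
    }

    /** Returns how many distinct levels have been registered. */
    int numLevels() {
        return levels.size();
    }
}
```

Storing each cell as a small integer and keeping the strings in one shared list is what makes repeated category values cheap to hold and compare.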
54enough rows available), but the proc writing to that buf cannot write to it (because the buf doesn't have enough free rows for writing to). In
55Cnkmisc Library
The cnkmisc library contains the C++ code for the various C++ procs in Spotfire Miner. This includes components such as linear regre
56These classes, and their publicly-accessible methods, will be described below. All of these classes have similarly-named header files (CNKObj.h, CNKB
57These methods set and get a name associated with the CNK object. The initial value for name is NULL. This is a good opportunity to mention several a
58 severity_warning, severity_error }; void addError(const char* msg); void addWarning(const char* msg); void addInformation(const char* msg)
59only storing messages for the first few rows. The getNumMessagesAtLevel() method returns the number of messages at a specified severity level.CNKOb
6OVERVIEW
Spotfire Miner is written in Java and C++. The graphical user interface and some computational components are in Java. The underlying pipel
60 static int isLevelNumNA(long val); static const char* getStringNA(); static int isStringNA(const char* val); static CNKTimeDate getTime
61 static CNKConverter* getDefaultConverter();These methods support conversions between string and double or time/date values. Such conversions de
62CNKPropertyInfo Information other than the actual data is stored and exchanged using properties. A property is a name/value pair where the name uni
63These methods are similar to getPropAsInt, etc., except that they can interpret the objects representing pointers to CNKBuf and CNKProc objects. Fo
64 if (propInfo->isPropName("seed")) { setSeed(propInfo->getPropAsINT32()); } else if (propInfo->isPropName("total.
65CNKBuf, CNKBufReader, CNKBufWriter: Data Buffers
The three classes CNKBuf, CNKBufReader, and CNKBufWriter together are used to implement a first-in-f
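Assuming the buffer behaves as a bounded first-in-first-out queue of rows, the idea can be sketched in Java as below. `ToyRowBuffer` and its methods are hypothetical names chosen for this sketch; it does not reproduce the CNKBuf, CNKBufReader, or CNKBufWriter interfaces, and it ignores column types, EOF handling, and backing files.

```java
import java.util.ArrayDeque;

/** Hypothetical bounded FIFO of rows; not the CNKBuf implementation. */
class ToyRowBuffer {
    private final ArrayDeque<double[]> rows = new ArrayDeque<>();
    private final int capacity;

    ToyRowBuffer(int capacity) {
        this.capacity = capacity;
    }

    /** Rows a writer could still add before the buffer is full. */
    int freeRows() {
        return capacity - rows.size();
    }

    /** Rows currently available to a reader. */
    int rowsReady() {
        return rows.size();
    }

    /** Appends a row; returns false immediately (no blocking) if the buffer is full. */
    boolean write(double[] row) {
        if (rows.size() >= capacity) {
            return false;
        }
        rows.addLast(row);
        return true;
    }

    /** Removes and returns the oldest row, or null if no row is available. */
    double[] read() {
        return rows.pollFirst();
    }
}
```

Both `write` and `read` return at once rather than waiting, so a caller has to check `freeRows()` or `rowsReady()` and try again later when the buffer cannot satisfy the request.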
66immediately: it does not support "suspending" a CNKBuf data access method, if the data is not currently available. To handle these limita
67When a CNKMemoryBuf or CNKBackingFileBuf is passed to one of the CNKProc::setInputBuf or CNKProc::setOutputBuf methods, this actually creates an in
68CNKBuf:: void setNumRows(long numRows); void setNumColumns(int cols); long getNumRows(); int getNumColumns();Set/get the number of rows
69 column_type_string, column_type_timeDate}; void setColumnType(int colNum, int typeNum); int getColumnType(int colNum); int columnTypeI
7EXTENSION FILES
The Spotfire Miner software is implemented using a large number of Java and C++ object files, image files, etc., stored in various subd
70integer part of this number is the number of days, and the fractional part represents a fraction of a day, e.g. the hours, minutes, and seconds.CNKB
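A value whose integer part counts days and whose fractional part is a fraction of a day can be split as in the following Java sketch. The class `FractionalDay` and its methods are hypothetical helpers, not part of the Spotfire Miner API, and the use of floor for the integer part (which matters only for negative values) is an assumption.

```java
/** Hypothetical helper; not part of the Spotfire Miner API. */
class FractionalDay {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    /** Whole days: the integer part of the value, taken as its floor. */
    static long wholeDays(double value) {
        return (long) Math.floor(value);
    }

    /** Milliseconds into the day, derived from the fractional part. */
    static long millisOfDay(double value) {
        double fraction = value - Math.floor(value);
        return Math.round(fraction * MILLIS_PER_DAY);
    }
}
```

For example, 2.75 splits into 2 whole days and 64,800,000 milliseconds, which is 18 hours into the third day.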
71The "MaxAutoLevels" property determines the maximum number of levels that will be automatically created. The actual number of levels in a
72CNKBuf will call CNKBufReader::getEOF() to check whether EOF has been set. CNKBuf::getEOF() gives the same information: it returns true (non-
73To avoid this problem, the CNKBuf must have at least as many rows as the largest possible CNKBufWriter::setRequestRows value plus the largest possib
74single buf reader can access the data sequentially, or read the data in multiple passes or via random access. As different parts of the data are ac
75These methods return information about the storage of the data in the backing file. getBackingFileRowBytes returns the number of bytes for each row
76increasing, until releaseRows is called to release some of the read data rows. Therefore, it is safe to save the value returned by getRowsReady, an
77 virtual double getDouble(long rowNum, int columnNum); virtual long getLevelNum(long rowNum, long colNum); virtual const char* getString(lo
78releaseAll() should be called when the reader is no longer interested in reading this data. After calling releaseAll(), no requests will be satisfi
79CNKBufWriter Methods
Most of the methods for CNKBufWriter are similar to those for CNKBufReader. Like the other class, there is no public constructo
8• Possibly a Windows DLL with compiled C++ code.
• Possibly help files in a format such as compiled HtmlHelp
• Possibly documentation in a format such
80isReady returns true (non-zero) if the CNKBuf is ready to be written to. Normally, this is true if getRowsReady() is greater than or equal to getRe
81With setConvertFactor, if the column is actually a double column, the argument is registered as a factor level for the column (by calling CNKBuf::ma
82 virtual INT64 getChunkPosition();At any time, a CNKBufWriter can write a "chunk" of data rows in a potentially-very-large series of da
83Likewise, the subclass init method should call CNKProc::init(), as follows:
void CNKProcCount::init()
{
    CNKProc::init();
    ...
}
The destructor ~CNKPro
84After inputs and outputs have been created, they can be accessed with getInputBuf and getOutputBuf (returning the linked CNKBuf objects), and getInp
85At a rough level of detail, the pipeline engine works by repeatedly scanning through all of the CNKProc objects. For each CNKProc, the engine calls
86Each CNKProc contains a field counting the number of times that it has been executed. This field is incremented by the pipeline engine, which calls
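The repeated scan over procs, with the engine rather than the proc keeping the execution count, can be pictured with the Java sketch below. `ToyProc`, `ToyEngine`, and the rule "stop when a full scan executes nothing" are assumptions made for illustration; the actual CNKProc interface and the engine's scheduling and termination rules are not reproduced here.

```java
import java.util.List;

/** Hypothetical stand-in for a proc; not the CNKProc API. */
interface ToyProc {
    /** True if the proc has enough input and output room to do some work. */
    boolean isReady();

    /** Performs one unit of work. */
    void execute();
}

/** Hypothetical engine loop; the real scheduling rules may differ. */
class ToyEngine {
    /**
     * Scans all procs repeatedly, executing each one that is ready,
     * until a full scan executes nothing. Returns the total number
     * of executions, which is counted here and not by the procs.
     */
    static long run(List<ToyProc> procs) {
        long executions = 0;
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (ToyProc p : procs) {
                if (p.isReady()) {
                    p.execute();
                    executions++;
                    progressed = true;
                }
            }
        }
        return executions;
    }
}
```

Because the loop only stops after a scan in which no proc was ready, a proc that becomes ready as a side effect of another proc's execution is picked up on the next pass.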
87of a class, this creates and exports a regular C function (with a long, ugly name including the class name) that creates an instance of that class.
881. Request input and output data rows.The execute() method must call CNKBufReader::setRequestRows to request access to its input data, and call CNKB
89If the CNKProc has outputs, the execute() method may set the column names and types of the output CNKBuf objects if they are not set already. This
9In this case, the extension subdirectory is named my_extension, and the extension uses two gif files.
Explorer IML Files
An Explorer IML file is an XML
90Releasing rows and performing these other tasks can often be done by calling CNKProc::executeReleaseRows(long rows), passing the number of rows to r
91After reading a set of input rows, the accumulated max, min, counts, and crosstab values can be extracted from the proc. CNKProcCount:: CNKProcCo
92getCountNA returns the number of NA values read. getCountOK returns the number of non-NA values read.
CNKProcCount:: int getColumnNumLevels(int c
93setCrossIndex sets the crosstab level for the crosstab column specified by crossNum. The level for this column is set to levelNum, which is interpr
94These methods set/get the total number of rows to be written by this proc, before the proc sends an EOF to the output buf, and specifies that it is
95CNKProcPrintf
The CNKProcPrintf proc prints information about each block of data. It prints the position of the block in the data, the number of row
96These are the constructor, destructor, and init() methods. The execute() method is run when this CNKProc is executed within a pipeline.
CNKProcRando
97 CNKProc* getProc(int i); CNKProc* getProc(const char* name); void addBuf(CNKBuf* buf); void removeBuf(CNKBuf* buf); int getNumBufs()
98 INT64 getLastExecutionCount(); INT64 getTotalExecutionCount();These methods return the number of times that the pipeline has executed a CNKPr