These are the proceedings of the fourth biennial conference in the Intelligent Data Analysis series. The conference took place in Cascais, Portugal, 13-15 September 2001. The theme of this conference series is the use of computers in intelligent ways in data analysis, including the exploration of intelligent programs for data analysis.

Data analytic tools continue to develop, driven by the computer revolution. Methods which would have required unimaginable amounts of computing power, and which would have taken years to reach a conclusion, can now be applied with ease and virtually instantly. Such methods are being developed by a variety of intellectual communities, including statistics, artificial intelligence, neural networks, machine learning, data mining, and interactive dynamic data visualization. This conference series seeks to bring together researchers studying the use of intelligent data analysis in these various disciplines, to stimulate interaction so that each discipline may learn from the others.

So as to encourage such interaction, we deliberately kept the conference to a single-track meeting. This meant that, of the almost 150 submissions we received, we were able to select only 23 for oral presentation and 16 for poster presentation. In addition to these contributed papers, there was a keynote address from Daryl Pregibon, invited presentations from Katharina Morik, Rolf Backhofen, and Sunil Rao, and a special "data challenge" session, where researchers described their attempts to analyse a challenging data set provided by Paul Cohen. This acceptance rate enabled us to ensure a high quality conference, while also permitting us to provide good coverage of the various topics subsumed within the general heading of intelligent data analysis.

We would like to express our thanks and appreciation to everyone involved in the organization of the meeting and the selection of the papers. It is the behind-the-scenes efforts which ensure the smooth running and success of any conference.
We would also like to express our gratitude to the sponsors: Fundação para a Ciência e a Tecnologia, Ministério da Ciência e da Tecnologia, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Fundação Calouste Gulbenkian and IPE Investimentos e Participações Empresariais, S.A.

September 2001
Frank Hoffmann, David J. Hand, Niall Adams, Gabriela Guimaraes, Doug Fisher

Organization

IDA 2001 was organized by the Department of Computer Science, New University of Lisbon.

Conference Committee

General Chair: Douglas Fisher (Vanderbilt University, USA)
Program Chairs: David J. Hand (Imperial College, UK), Niall Adams (Imperial College, UK)
Conference Chair: Gabriela Guimaraes (New University of Lisbon, Portugal)
Publicity Chair: Frank Höppner (Univ. of Appl. Sciences Emden, Germany)
Publication Chair: Frank Hoffmann (Royal Institute of Technology, Sweden)
Local Chair: Fernando Moura-Pires (University of Evora, Portugal)
Area Chairs: Roberta Siciliano (University of Naples, Italy), Arno Siebes (CWI, The Netherlands), Pavel Brazdil (University of Porto, Portugal)

Program Committee

Niall Adams (Imperial College, UK)
Pieter Adriaans (Syllogic, The Netherlands)
Russell Almond (Educational Testing Service, USA)
Thomas Bäck (Informatik Centrum Dortmund, Germany)
Riccardo Bellazzi (University of Pavia, Italy)
Michael Berthold (Tripos, USA)
Liu Bing (National University of Singapore)
Paul Cohen (University of Massachusetts, USA)
Paul Darius (Leuven University, Belgium)
Fazel Famili (National Research Council, Canada)
Douglas Fisher (Vanderbilt University, USA)
Karl Froeschl (University of Vienna, Austria)
Alex Gammerman (Royal Holloway, UK)
Adolf Grauel (University of Paderborn, Germany)
Gabriela Guimaraes (New University of Lisbon, Portugal)
Lawrence O. Hall (University of South Florida, USA)
Frank Hoffmann (Royal Institute of Technology, Sweden)
Adele Howe (Colorado State University, USA)
Klaus-Peter Huber (SAS Institute, Germany)
David Jensen (University of Massachusetts, USA)
Joost Kok (Leiden University, The Netherlands)
Rudolf Kruse (University of Magdeburg, Germany)
Frank Klawonn (University of Applied Sciences Emden, Germany)
Hans Lenz (Free University of Berlin, Germany)
David Madigan (Soliloquy, USA)
Rainer Malaka (European Media Laboratory, Germany)
Heikki Mannila (Nokia, Finland)
Fernando Moura Pires (University of Evora, Portugal)
Susana Nascimento (University of Lisbon, Portugal)
Wayne Oldford (University of Waterloo, Canada)
Albert Pr
Table of Contents
The Fourth International Symposium on Intelligent Data Analysis. - Feature Characterization in Scientific Datasets. - Relevance Feedback in the Bayesian Network Retrieval Model: An Approach Based on Term Instantiation. - Generating Fuzzy Summaries From Fuzzy Multidimensional Databases. - A Mixture-of-Experts Framework for Learning from Imbalanced Data Sets. - Predicting Time-Varying Functions with Local Models. - Building Models of Ecological Dynamics Using HMM Based Temporal Data Clustering: A Preliminary Study. - Tagging with Small Training Corpora. - A Search Engine for Morphologically Complex Languages. - Errors Detection and Correction in Large Scale Data Collecting. - A New Framework to Assess Association Rules. - Communities of Interest. - An Evaluation of Grading Classifiers. - Finding Informative Rules in Interval Sequences. - Correlation-Based and Contextual Merit-Based Ensemble Feature Selection. - Nonmetric Multidimensional Scaling with Neural Networks. - Functional Trees for Regression. - Data Mining with Products of Trees. - S3Bagging: Fast Classifier Induction Method with Subsampling and Bagging. - RNA-Sequence-Structure Properties and Selenocysteine Insertion. - An Algorithm for Segmenting Categorical Time Series into Meaningful Episodes. - An Empirical Comparison of Pruning Methods for Ensemble Classifiers. - A Framework for Modelling Short, High-Dimensional Multivariate Time Series: Preliminary Results in Virus Gene Expression Data Analysis. - Using Multiattribute Prediction Suffix Graphs for Spanish Part-of-Speech Tagging. - Self-Supervised Chinese Word Segmentation. - Analyzing Data Clusters: A Rough Sets Approach to Extract Cluster-Defining Symbolic Rules. - Finding Polynomials to Fit Multivariate Data Having Numeric and Nominal Variables. - Fluent Learning: Elucidating the Structure of Episodes. - An Intelligent Decision Support Model for Aviation Weather Forecasting. - MAMBO: Discovering Association Rules Based on Conditional Independencies. 
- Model Building for Random Fields. - Active Hidden Markov Models for Information Extraction. - Adaptive Lightweight Text Filtering. - General Algorithm for Approximate Inference in Multiply Sectioned Bayesian Networks. - Investigating Temporal Patterns of Fault Behaviour Within Large Telephony Networks. - Closed Set Based Discovery of Representative Association Rules. - Intelligent Sensor Analysis and Actuator Control. - Sampling of Highly Correlated Data for Polynomial Regression and Model Discovery. - The IDA'01 Robot Data Challenge.