1. Identity statement
Reference TypeConference Paper (Conference Proceedings)
Sitemtc-m21c.sid.inpe.br
Holder Codeisadg {BR SPINPE} ibi 8JMKD3MGPCW/3DT298S
Identifier8JMKD3MGP3W34R/43FHDQS
Repositorysid.inpe.br/mtc-m21c/2020/10.25.14.55   (restricted access)
Last Update2020:10.25.14.55.28 (UTC) simone
Metadata Repositorysid.inpe.br/mtc-m21c/2020/10.25.14.55.28
Metadata Last Update2022:01.04.01.35.31 (UTC) administrator
Secondary KeyINPE--PRE/
DOI10.1007/978-3-030-58799-4_74
ISBN978-303058798-7
ISSN03029743
Citation KeyPinheiroSilvSoarQuil:2020:GrClAn
TitleA graph-based clustering analysis of the QM9 dataset via SMILES descriptors
Year2020
Access Date2024, Mar. 29
Secondary TypePRE CI
Number of Files1
Size2547 KiB
2. Context
Author1 Pinheiro, Gabriel Augusto Lins Leal
2 Silva, Juarez L. F. da
3 Soares, Marinalva D.
4 Quiles, Marcos Gonçalves
ORCID1
2
3
4 0000-0001-8147-554X
Group1 LABAC-COCTE-INPE-MCTIC-GOV-BR
Affiliation1 Instituto Nacional de Pesquisas Espaciais (INPE)
2 Universidade de São Paulo (USP)
3 Universidade Federal de São Paulo (UNIFESP)
4 Universidade Federal de São Paulo (UNIFESP)
Author e-Mail Address1 gabriel.pinheiro@inpe.br
2 juarez.dasilva@iqsc.usp.br
3 mdiasoraes@gmail.com
4 quiles@unifesp.br
EditorGervasi, O.
Murgante, B.
Misra, S.
Garau, C.
Blecic, I.
Taniar, D.
Apduhan, B. O.
Rocha, A. M. A. C.
Tarantino, E.
Torre, C. M.
Karaca, Y.
Conference NameInternational Conference on Computational Science and Its Applications (ICCSA), 20
Conference LocationCagliari, Italy
Date01-04 July
PublisherSpringer
Pages421-433
Book TitleProceedings
History (UTC)2020-10-25 14:55:28 :: simone -> administrator ::
2022-01-04 01:35:31 :: administrator -> simone :: 2020
3. Content and structure
Is the master or a copy?is the master
Content Stagecompleted
Transferable1
Content TypeExternal Contribution
Version Typepublisher
KeywordsClustering · Graph · Quantum-chemistry
AbstractMachine learning has become a hot topic in Materials Science. For instance, several unsupervised and supervised learning approaches have been applied as surrogate models to study the properties of various classes of materials. Here, we investigate the QM9 quantum-chemistry dataset, one of the most widely used datasets in this field, from a graph-based clustering perspective. Our investigation is twofold: 1) to understand whether the QM9 samples are organized in clusters, and 2) to determine whether the clustering structure might provide insights into anomalous molecules, that is, molecules that jeopardize the accuracy of supervised property prediction methods. Our results show that the QM9 dataset is indeed structured into clusters. These clusters might, for instance, suggest better approaches for splitting the dataset when using cross-validation in supervised learning. However, regarding our second question, our findings indicate that the clustering structure obtained via the Simplified Molecular Input Line Entry System (SMILES) representation cannot be used to filter anomalous samples in property prediction. Thus, this limitation should be investigated further in future research.
AreaCOMP
Arrangementurlib.net > BDMCI > Fonds > Produção anterior à 2021 > LABAC > A graph-based clustering...
doc Directory Contentaccess
source Directory Contentthere are no files
agreement Directory Content
agreement.html 25/10/2020 11:55 1.0 KiB 
4. Conditions of access and use
Languageen
Target Filepinheiro_graph.pdf
User Groupsimone
Visibilityshown
Read Permissiondeny from all and allow from 150.163
Update Permissionnot transferred
5. Allied materials
Next Higher Units8JMKD3MGPCW/3ESGTTP
Host Collectionurlib.net/www/2017/11.22.19.04
6. Notes
NotesLecture Notes in Computer Science, v.12249
Empty Fieldsarchivingpolicy archivist callnumber copyholder copyright creatorhistory descriptionlevel dissemination e-mailaddress edition format label lineage mark mirrorrepository nextedition numberofvolumes organization parameterlist parentrepositories previousedition previouslowerunit progress project publisheraddress readergroup resumeid rightsholder schedulinginformation secondarydate secondarymark serieseditor session shorttitle sponsor subject tertiarymark tertiarytype type url volume
7. Description control
e-Mail (login)simone