1. Identity statement | |
Reference Type | Conference Paper (Conference Proceedings) |
Site | mtc-m21c.sid.inpe.br |
Holder Code | isadg {BR SPINPE} ibi 8JMKD3MGPCW/3DT298S |
Identifier | 8JMKD3MGP3W34R/43FHDQS |
Repository | sid.inpe.br/mtc-m21c/2020/10.25.14.55 (restricted access) |
Last Update | 2020:10.25.14.55.28 (UTC) simone |
Metadata Repository | sid.inpe.br/mtc-m21c/2020/10.25.14.55.28 |
Metadata Last Update | 2022:01.04.01.35.31 (UTC) administrator |
Secondary Key | INPE--PRE/ |
DOI | 10.1007/978-3-030-58799-4_74 |
ISBN | 978-303058798-7 |
ISSN | 03029743 |
Citation Key | PinheiroSilvSoarQuil:2020:GrClAn |
Title | A graph-based clustering analysis of the QM9 dataset via SMILES descriptors |
Year | 2020 |
Access Date | 2024, May 04 |
Secondary Type | PRE CI |
Number of Files | 1 |
Size | 2547 KiB |
|
2. Context | |
Author | 1 Pinheiro, Gabriel Augusto Lins Leal 2 Silva, Juarez L. F. da 3 Soares, Marinalva D. 4 Quiles, Marcos Gonçalves |
ORCID | 1 2 3 4 0000-0001-8147-554X |
Group | 1 LABAC-COCTE-INPE-MCTIC-GOV-BR |
Affiliation | 1 Instituto Nacional de Pesquisas Espaciais (INPE) 2 Universidade de São Paulo (USP) 3 Universidade Federal de São Paulo (UNIFESP) 4 Universidade Federal de São Paulo (UNIFESP) |
Author e-Mail Address | 1 gabriel.pinheiro@inpe.br 2 juarez.dasilva@iqsc.usp.br 3 mdiasoraes@gmail.com 4 quiles@unifesp.br |
Editor | Gervasi, O. Murgante, B. Misra, S. Garau, C. Blecic, I. Taniar, D. Apduhan, B. O. Rocha, A. M. A. C. Tarantino, E. Torre, C. M. Karaca, Y. |
Conference Name | International Conference on Computational Science and Its Applications (ICCSA), 20th |
Conference Location | Cagliari, Italy |
Date | 01-04 July |
Publisher | Springer |
Pages | 421-433 |
Book Title | Proceedings |
History (UTC) | 2020-10-25 14:55:28 :: simone -> administrator :: 2022-01-04 01:35:31 :: administrator -> simone :: 2020 |
|
3. Content and structure | |
Is the master or a copy? | is the master |
Content Stage | completed |
Transferable | 1 |
Content Type | External Contribution |
Version Type | publisher |
Keywords | Clustering · Graph · Quantum-chemistry |
Abstract | Machine learning has become a new hot topic in Materials Sciences. For instance, several approaches from unsupervised and supervised learning have been applied as surrogate models to study the properties of several classes of materials. Here, we investigate, from a graph-based clustering perspective, the Quantum QM9 dataset. This dataset is one of the most used datasets in this scenario. Our investigation is twofold: 1) understand whether the QM9 samples are organized in clusters, and 2) determine whether the clustering structure might provide us with some insights regarding anomalous molecules, or molecules that jeopardize the accuracy of supervised property prediction methods. Our results show that the QM9 is indeed structured into clusters. These clusters, for instance, might suggest better approaches for splitting the dataset when using cross-correlation approaches in supervised learning. However, regarding our second question, our findings indicate that the clustering structure, obtained via Simplified Molecular Input Line Entry System (SMILES) representation, cannot be used to filter anomalous samples in property prediction. Thus, further investigation regarding this limitation should be conducted in future research. |
Area | COMP |
Arrangement | urlib.net > BDMCI > Fonds > Produção anterior à 2021 > LABAC > A graph-based clustering... |
doc Directory Content | access |
source Directory Content | there are no files |
agreement Directory Content | |
|
4. Conditions of access and use | |
Language | en |
Target File | pinheiro_graph.pdf |
User Group | simone |
Visibility | shown |
Read Permission | deny from all and allow from 150.163 |
Update Permission | not transferred |
|
5. Allied materials | |
Next Higher Units | 8JMKD3MGPCW/3ESGTTP |
Host Collection | urlib.net/www/2017/11.22.19.04 |
|
6. Notes | |
Notes | Lecture Notes in Computer Science, v.12249 |
Empty Fields | archivingpolicy archivist callnumber copyholder copyright creatorhistory descriptionlevel dissemination e-mailaddress edition format label lineage mark mirrorrepository nextedition numberofvolumes organization parameterlist parentrepositories previousedition previouslowerunit progress project publisheraddress readergroup resumeid rightsholder schedulinginformation secondarydate secondarymark serieseditor session shorttitle sponsor subject tertiarymark tertiarytype type url volume |
|
7. Description control | |
e-Mail (login) | simone |
|