scHiCcompare #3649

Open
10 tasks done
hamy12398 opened this issue Nov 6, 2024 · 21 comments
Labels
2. review in progress (assign a reviewer and a more thorough review of package code and documentation taking place), OK

@hamy12398

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

  • I understand that by submitting my package to Bioconductor,
    the package source and all review commentary are visible to the
    general public.

  • I have read the Bioconductor Package Submission
    instructions. My package is consistent with the Bioconductor
    Package Guidelines.

  • I understand Bioconductor Package Naming Policy and acknowledge
    Bioconductor may retain use of package name.

  • I understand that a minimum requirement for package acceptance
    is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
    Passing these checks does not result in automatic acceptance. The
    package will then undergo a formal review and recommendations for
    acceptance regarding other Bioconductor standards will be addressed.

  • My package addresses statistical or bioinformatic issues related
    to the analysis and comprehension of high throughput genomic data.

  • I am committed to the long-term maintenance of my package. This
    includes monitoring the support site for issues that users may
    have, subscribing to the bioc-devel mailing list to stay aware
    of developments in the Bioconductor community, responding promptly
    to requests for updates from the Core team in response to changes in
    R or underlying software.

  • I am familiar with the Bioconductor code of conduct and
    agree to abide by it.

I am familiar with the essential aspects of Bioconductor software
management, including:

  • The 'devel' branch for new packages and features.
  • The stable 'release' branch, made available every six
    months, for bug fixes.
  • Bioconductor version control using Git
    (optionally via GitHub).

For questions/help about the submission process, including questions about
the output of the automatic reports generated by the SPB (Single Package
Builder), please use the #package-submission channel of our Community Slack.
Follow the link on the home page of the Bioconductor website to sign up.

@bioc-issue-bot
Collaborator

Hi @hamy12398

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: scHiCcompare
Title: Differential Analysis of Single-cell Hi-C Data
Version: 0.99.0
Authors@R: c(
    person(given = "My", family = "Nguyen",
 email = "[email protected]",
 role = c("aut"),
 comment = c(ORCID = "0009-0003-1118-7085")),
    person(given = "Mikhail",
 family = "Dozmorov",
 role = c("aut", "cre"),
 email = "[email protected]",
 comment = c(ORCID = "0000-0002-0086-8358")))
Description: This package provides functions for differential chromatin interaction analysis 
    between two single-cell Hi-C data groups. It includes tools for imputation, normalization, 
    and differential analysis of chromatin interactions. The package implements pooling techniques 
    for imputation and offers methods to normalize and test for differential interactions across 
    single-cell Hi-C datasets.
Imports: 
    DT,
    data.table,
    dplyr,
    ggplot2,
    gtools,
    HiCcompare,
    lattice,
    mclust,
    mice,
    miceadds,
    rstatix,
    tidyr,
    gridExtra
Suggests: knitr,
    rmarkdown,
    testthat,
    BiocStyle
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
LazyDataCompression: xz
RoxygenNote: 7.3.2
Depends: R (>= 4.2.0)
VignetteBuilder: knitr
biocViews: Software, SingleCell, HiC, Sequencing, Normalization
BugReports: https://github.com/dozmorovlab/ScHiCcompare/issues
URL: https://github.com/dozmorovlab/ScHiCcompare

@bioc-issue-bot added the "1. awaiting moderation" label (submitted and waiting clearance to access resources) on Nov 6, 2024
@lshep added the "pre-check passed" label (pre-review performed and ready to be added to git) on Nov 25, 2024
@bioc-issue-bot
Collaborator

Your package has been added to git.bioconductor.org to continue the
pre-review process. A build report will be posted shortly. Please
fix any ERROR and WARNING in the build report before a reviewer is
assigned or provide a justification on why you feel the ERROR or
WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting
up remotes to push to git.bioconductor.org. All changes should be
pushed to git.bioconductor.org moving forward. It is required to push a
version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your GitHub SSH keys for git.bioconductor.org
access. To manage keys and future access you may want to activate your
Bioconductor Git Credentials Account.

@bioc-issue-bot added the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) and removed the "1. awaiting moderation" and "pre-check passed" labels on Nov 25, 2024
@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 24.04.1 LTS): scHiCcompare_0.99.0.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/scHiCcompare to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

@lshep
Contributor

lshep commented Dec 3, 2024

Please fix the ERROR and push a valid version bump to get a new build report; a reviewer will not be assigned before then.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: ecbec6d826f3b4d4896cb4f87bae3155887e4a6c

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "TIMEOUT, skipped".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
ERROR before build products produced.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 9abe068e23f8be33cd5f6107e53eacc1f4a168d5

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "TIMEOUT, skipped".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
ERROR before build products produced.

@hamy12398
Author

Hi @lshep,
The error in my first build report occurred because the repository contained an extra file. To fix it, I removed that file, but the next builds then hit the TIMEOUT problem when building the vignette. I re-ran the checks on my local laptop and did not see the TIMEOUT shown in the report. Is there a way to check how my package behaves in the Bioconductor build environment? I am not sure what is causing this problem.
Thank you so much.

@lshep
Contributor

lshep commented Dec 23, 2024

I also just tried locally and killed it after 40 min! The package should build and check in ideally under 15 min. Maybe use a smaller dataset?

@lshep
Contributor

lshep commented Dec 23, 2024

I used Stangle to extract just the R code from the vignette and ran it manually. It seems slow or stuck at the "fitting" step of the scHiCcompare function call:

> # scHiCcompare(file.path.1, file.path.2,
> #   select.chromosome, .... [TRUNCATED] 
Imputing Condition 1 group cells in: 
pooled band 1, 
pooled band 2, 
pooled band 3, 
pooled band 4, 
pooled band 5, 
pooled band 6, 
pooled band 7, 
pooled band 10, 
pooled band 11, 
pooled band 12, 
pooled band 8, 
pooled band 9, 

Imputing Condition 2 group cells in: 
pooled band 1, 
pooled band 2, 
pooled band 3, 
pooled band 4, 
pooled band 5, 
pooled band 6, 
pooled band 7, 
pooled band 8, 
pooled band 9, 
pooled band 10, 
pooled band 11, 
pooled band 12, 

Transfering into pseudo-bulk sparse matrix.

Transfering into pseudo-bulk sparse matrix.

Jointly normalizing pseudo bulk matrices 
Span for loess: 0.0399471795291325
GCV for loess: 0.00017267469496343
AIC for loess: -0.0149900024956055
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

Processing detect differential chromotin interaction 
Span for loess: 0.0622079693724011
GCV for loess: 2.55798352175634e-05
AIC for loess: -1.94600566665868
Filtering out interactions with A < 1
fitting ...
  |||   0%


@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: c8e6289d082d2e4ff05ba334cc0ef3969cb52b43

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "TIMEOUT, skipped".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
ERROR before build products produced.

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: bd735d46945b15ef9d2aa18d2d6528da3a1913fc

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 24.04.1 LTS): scHiCcompare_0.99.4.tar.gz

@bioc-issue-bot
Collaborator

Received a valid push on git.bioconductor.org; starting a build for commit id: 6091b7d763f154ea3292239664f08499e19cb99c

@bioc-issue-bot
Collaborator

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 24.04.1 LTS): scHiCcompare_0.99.5.tar.gz

@bioc-issue-bot added the "OK" label and removed the "ERROR" label on Dec 26, 2024
@lshep removed the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) on Jan 2, 2025
@lshep added the "2. review in progress" label (assign a reviewer and a more thorough review of package code and documentation taking place) on Jan 2, 2025
@bioc-issue-bot
Collaborator

A reviewer has been assigned to your package for an in-depth review.
Please respond accordingly to any further comments from the reviewer.

@PeteHaitch

PeteHaitch commented Jan 2, 2025

Hi @hamy12398,

I've been assigned as reviewer for scHiCcompare.
I'll aim to provide my review within the next 3 weeks.

Before I get started, please remove scHiCcompare_0.99.3.tar.gz from the git repository; the git repo should only contain the source files and not the built package.
Please also consider and address the output of R CMD check and BiocCheck::BiocCheck(), as you will likely be asked to do so in any subsequent review.

Cheers,
Pete

@PeteHaitch

Hi @hamy12398,

Thank you for submitting scHiCcompare to Bioconductor.

I've completed my checklist review of scHiCcompare and overall the package is in good shape and close to being ready for acceptance.
My main suggestions are improvements to the vignette.

A general question I had was around the input format ("five-column tab-separated text files in a sparse matrix format").
I don't work with HiC data, but I wondered if this is a de facto standard format or if there are other input formats that should be considered?

In my checklist review below I have separated the issues into Required and Recommended points that I would ask you to address before the package can be accepted.
Would you please provide line-by-line responses to my initial review so that I know what changes to look for in my re-review?

Cheers,
Pete


Required

  • Why is the return class of scHiCcompare() a checkNumbers object rather than, say, a simple list? In fact, the 'Value' section of the ?scHiCcompare man page states that the returned object is a list and doesn't mention checkNumbers. I didn't find any documentation of the checkNumbers class, which is required if you are introducing a new (user-facing) class.
  • The BP_param argument of scHiC_bulk_compare() and scHiCcompare() should be named BPPARAM for consistency with other BiocParallel-backed functions in Bioconductor.
  • Use an informative name for the vignette rather than 'preciseTAD'. This title is what the user sees when searching for the vignette with browseVignettes().
  • Your citation file is empty and incorrectly named. Please fix. It should work when readCitationFile('inst/CITATION') is run; see https://contributions.bioconductor.org/citation.html.
  • The 'Usage' section of the documentation is different for the 3 data files; why?
  • I think the example data directories inst/MGs_example and inst/ODCs_example should be inst/extdata/MGs_example and inst/extdata/ODCs_example, respectively; see https://contributions.bioconductor.org/data.html?q=data#raw-data-and-the-instextdata-directory.
  • R CMD check complains of a few things via NOTES (see below) that should be addressed:
    • usethis::use_mit_license() will address License stub is invalid DCF.
    • R CMD check tells you how to fix some of the 'no visible global function definition for XXX' issues. However, others are related to the use of dplyr, tidyr or data.table and can be addressed via utils::globalVariables() or an alternative described in https://r-pkgs.org/package-within.html#echo-a-working-package.
$ R_ENVIRON_USER=~/.Renviron.bioc R CMD check scHiCcompare_0.99.5.tar.gz 
<snip>
* checking DESCRIPTION meta-information ... NOTE
License stub is invalid DCF.
* checking package subdirectories ... NOTE
Found the following CITATION file in a non-standard place:
  inst/CITATION.txt
Most likely ‘inst/CITATION’ should be used instead.
* checking R code for possible problems ... NOTE
.all_progressive_pooling: no visible global function definition for
  ‘tail’
.randomize_IFs: no visible global function definition for ‘rnorm’
GMM_layer: no visible global function definition for ‘shapiro.test’
RF_impute.outrm.schic: no visible global function definition for
  ‘boxplot.stats’
RF_impute.outrm.schic: no visible global function definition for
  ‘na.omit’
RF_impute.outrm.schic: no visible global function definition for
  ‘aggregate’
RF_impute.outrm.schic: no visible binding for global variable
  ‘Single_cell’
RF_impute.outrm.schic: no visible binding for global variable ‘IF’
RF_process: no visible global function definition for ‘boxplot.stats’
RF_process: no visible global function definition for ‘na.omit’
RF_process: no visible global function definition for ‘aggregate’
RF_process: no visible binding for global variable ‘Single_cell’
RF_process: no visible binding for global variable ‘IF’
best_A: no visible global function definition for ‘is’
best_A: no visible global function definition for ‘quantile’
differential_result_plot: no visible binding for global variable
  ‘adj.M’
differential_result_plot: no visible binding for global variable ‘D’
differential_result_plot: no visible binding for global variable
  ‘Difference.cluster’
find.collinear: no visible global function definition for ‘cor’
mice.rf_impute: no visible global function definition for ‘quantile’
mice.rf_impute: no visible global function definition for ‘IQR’
mice.rf_impute: no visible global function definition for ‘na.omit’
mice.rf_impute: no visible global function definition for ‘aggregate’
plot_HiCmatrix_heatmap: no visible global function definition for
  ‘colorRampPalette’
plot_imputed_distance_diagnostic: no visible binding for global
  variable ‘IF’
plot_imputed_distance_diagnostic: no visible binding for global
  variable ‘Group’
pools_impute : process_pool: no visible global function definition for
  ‘na.omit’
pools_impute : process_pool: no visible binding for global variable
  ‘Single_cell’
pools_impute : process_pool: no visible binding for global variable
  ‘IF’
print.checkNumbers: no visible global function definition for ‘na.omit’
read_files: no visible global function definition for ‘read.delim’
scHiC_bulk_compare: no visible global function definition for ‘bpparam’
scHiCcompare: no visible global function definition for ‘bpparam’
scHiCcompare: no visible binding for global variable ‘cell_id’
scHiCcompare: no visible binding for global variable ‘region1’
scHiCcompare: no visible binding for global variable ‘region2’
scHiCcompare: no visible binding for global variable ‘IF’
scHiCcompare : <anonymous>: no visible global function definition for
  ‘write.table’
scHiCcompare: no visible global function definition for ‘write.table’
scHiCcompare_impute: no visible global function definition for
  ‘boxplot’
scHiCcompare_impute: no visible binding for global variable
  ‘Single_cell’
scHiCcompare_impute: no visible binding for global variable ‘IF’
withoutNorm_hicTable: no visible global function definition for ‘:=’
withoutNorm_hicTable: no visible binding for global variable ‘A’
Undefined global functions or variables:
  := A D Difference.cluster Group IF IQR Single_cell adj.M aggregate
  boxplot boxplot.stats bpparam cell_id colorRampPalette cor is na.omit
  quantile read.delim region1 region2 rnorm shapiro.test tail
  write.table
Consider adding
  importFrom("grDevices", "boxplot.stats", "colorRampPalette")
  importFrom("graphics", "boxplot")
  importFrom("methods", "is")
  importFrom("stats", "D", "IQR", "aggregate", "cor", "na.omit",
             "quantile", "rnorm", "shapiro.test")
  importFrom("utils", "read.delim", "tail", "write.table")
to your NAMESPACE file (and ensure that your DESCRIPTION Imports field
contains 'methods').
</snip>
  • BiocCheck::BiocCheck() similarly reports some issues. Some are recommendations about coding style and removing redundancy, and I encourage you to consider these points and the advice offered, but for the following specific points I ask you to follow the advice or explain why you haven't:
    • NOTE: 'LazyData:' in the 'DESCRIPTION' should be set to false or removed
    • NOTE: Avoid 'suppressWarnings'/'*Messages' if possible
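
As a rough illustration of the two kinds of fix for the 'no visible ...' NOTEs above, here is a minimal sketch. The file name R/globals.R and the helper upper_quartile_IF() are hypothetical and not taken from scHiCcompare; the importFrom targets and variable names are copied from the R CMD check output. It assumes the package's NAMESPACE is generated by roxygen2.

```r
# Sketch of a hypothetical R/globals.R; file and helper names are illustrative.

# roxygen2 tags that regenerate some of the importFrom() lines suggested by
# R CMD check (assumes NAMESPACE is managed by roxygen2).
#' @importFrom stats na.omit quantile
#' @importFrom utils tail
NULL

# Column names referenced through non-standard evaluation (dplyr, tidyr,
# data.table) are declared so the check no longer reports them as
# undefined global variables.
utils::globalVariables(c("IF", "Single_cell", "cell_id", "region1", "region2"))

# Hypothetical helper: the calls are namespace-qualified, which is another
# way to avoid 'no visible global function definition' without any import.
upper_quartile_IF <- function(x) {
  stats::quantile(stats::na.omit(x), probs = 0.75, names = FALSE)
}

upper_quartile_IF(c(1, 2, 3, 4, NA))
```

Either approach should work; namespace-qualified calls keep NAMESPACE small, while importFrom keeps the function bodies shorter.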

Recommended

  • The vignette is the first thing many potential users will look at, but it takes a long time to get to using the package on some example data. I'd suggest re-organising the content so that the 'hands-on' stuff comes more towards the beginning of the vignette and the examples of other ways to load in data come later in the vignette.
  • Related, the 'Input' section of the vignette should make clear which code chunks are not evaluated in the vignette. As they are written, these unevaluated chunks also wouldn't work if a user copy-pasted them, e.g., because GEOquery, etc. are never loaded/attached via library(GEOquery).
  • In the vignette, I generally recommend not repeating information that is already available in the man page documentation. Instead, have the vignette focus on the bigger picture. Most particularly, I would leave the description of function arguments to the man pages (otherwise if you make changes you have to remember to update them in multiple places).
  • Strongly recommend proofreading the rendered vignette to check it appears as you intend (especially to check formatting of markdown).
  • Although BioC discourages use of packages not on CRAN or BioC (and forbids them as package dependencies), I think it's okay to use them in the vignette. But FYI you can use BiocManager::install("immunogenomics/harmony") instead of requiring the user to install devtools.
  • In the vignette you write, "the ‘.txt’ files need to be saved in tab-separated columns and no row names, column names, or quotes around character strings with the example format below.". Perhaps it is clearer to show the next table as raw markdown output rather than as rendered table because the rendering obscures the point you are making about the formatting.
  • Strongly recommend adding captions to figures in the vignette via the fig.cap chunk option.
  • scHiCcompare() produces 3 figures, but when run interactively a user really only gets to see the last figure because they 'overwrite' each other in the interactive graphics device. Perhaps you can return the 3 figures on a single device? The patchwork package is very helpful for this.
  • Clarification: For the 'MD plot' panels, does 'D' mean 'distance' (what it looks like) or 'difference' (what it usually means)?
  • Consider more selective imports into your NAMESPACE using importFrom rather than importing all functions from your dependencies using import.
  • Consider adding a package-level man page; see https://contributions.bioconductor.org/docs.html#package-level-documentation.
  • Good job adding unit tests. I recommend running covr::report() to get a report about code coverage of your package to identify areas that may need testing.
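
To make the point about the '.txt' formatting concrete, here is a small sketch of writing a tab-separated file with no row names, column names, or quotes, and then printing the raw lines rather than a rendered table. The five column meanings (chr1, start1, chr2, start2, IF) and the toy values are assumptions for illustration only and may not match what the package expects.

```r
# Toy interaction table; the column meanings and values are assumed for
# illustration and are not taken from the package's example data.
toy_cell <- data.frame(
  chr1   = c("chr20", "chr20", "chr20"),
  start1 = c(0L, 0L, 1000000L),
  chr2   = c("chr20", "chr20", "chr20"),
  start2 = c(0L, 1000000L, 1000000L),
  IF     = c(12, 3, 8)
)

out_file <- file.path(tempdir(), "toy_cell.txt")

# Tab-separated, no row names, no column names, no quotes around strings.
write.table(toy_cell, file = out_file, sep = "\t",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Show the raw lines exactly as they sit on disk.
raw_lines <- readLines(out_file)
cat(raw_lines, sep = "\n")

# Read the file back to confirm it parses as five unnamed columns.
round_trip <- read.delim(out_file, header = FALSE)
```

Printing the raw lines in the vignette would let a reader see the absence of headers and quotes directly.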

@PeteHaitch

Hi @hamy12398,

FYI I will be on leave until Feb 17, so please don't expect a reply from me until after I return to work.

Cheers,
Pete

4 participants