This repository has been archived by the owner on Jan 3, 2023. It is now read-only.

merge upstream to resolve use_mkldnn issue (#557)
* fix link for gluon model zoo (#13583)

* Fix exception handling api doc (#13519)

* Fix exception handling api doc

* Update waitall api doc

Co-Authored-By: anirudh2290 <[email protected]>

* add cpp example inception to nightly test (#13534)

* add inception test

* fix max iter for mlp

* rename and add comment

* rename epoch num

* Add notes about debug with libstdc++ symbols (#13533)

* Add imresize and copyMakeBorder to mx.image (#13357)

* Add imresize API to docs

* address comments

* copyMakeBorder

* [MXNET-1253] fix control_flow_op (#13555)

* fix control_flow_op

* change type for M

* add test for sparse where op

* Add Intel MKL blas to Jenkins (#13607)

* add mkl blas to Jenkins

* add mkl install script

* fix bug in mkl script

* remove python2 ut and add cpu-mkl node

*  #13385 [Clojure] - Turn examples into integration tests (#13554)

* fix the Float not showing correctly problem (#13617)

Merge this PR for 1.4.x

* [MXNET-1155] Add scala packageTest utility (#13046)

* [MXNET-1155] Add scala packageTest utility

* Clean up utility

* Safe change directory in Makefile for scala

* mvn install file instructions with details

* [MXNET-1224]: improve scala maven jni build and packing. (#13493)

Major JNI feature changes. Please find more info here: https://cwiki.apache.org/confluence/display/MXNET/Scala+maven+build+improvement

* [MXNET-1225] Always use config.mk in make install instructions (#13364)

* Always use config.mk in make install instructions

* Specify Cuda 0 for ubuntu with mkldnn

* Scala install doc avoid build_from_source

Minor doc fixes

* Fix build_from_source CMake usage

* CPP Install Instruction with CMake

* Use cmake out of source build

* Fix warning in waitall doc (#13618)

* Optimize C++ API (#13496)

* Optimize C++ API

Pass parameter with reference instead of value.
Add const as well as it is not changed.

* fix docs/architecture/overview.md

Fix BinaryShapeFunction typedef
Add a right brace for SmoothL1Shape_

* fix quantize pass error when the quantization supported Op are excluded in the model (#13596)

* Scripts for building dependency libraries of MXNet (#13282)

* openblas script

* ps-lite dependencies

* USE_S3 dependencies

* image libraries

* license

* add batch norm test (#13625)

* add batch norm test

* fix formatting

* use out_arr as input

* fix typo

* remove const

* use ptr

* eval ptr

* Set install path for libmxnet.so dynamic lib on Mac OS (#13629)

* Fix the bug of BidirectionalCell (#13575)

* Fix the bug of BidirectionalCell

After calling hybridize( ) and passing "valid_length" to the unroll( ) function of BidirectionalCell, an AssertionError was raised at line 79. The cause is that symbol.split( ) returns a single symbol rather than a list of symbols, so the length of inputs does not equal the "length" parameter when unroll( ) is called to compute r_outputs and r_states.

* add a test for BidirectionalCell

* Fix the bug of BidirectionalCell

* fix test_bidirectional_unroll_valid_length( )

Fix the error of parameter.

* Fix the bug of BidirectionalCell

* fix test_bidirectional_unroll_valid_length( )

* Feature/mkldnn static (#13628)

* Revert "Revert "Feature/mkldnn static 2 (#13503)" (#13540)"

This reverts commit a3eca5f5c96eed0bc29bd4e58e470997091a1fb3.

* include headers on mkldnn lib

* retrigger

* retrigger

* build config for maven and pip (#13556)

* config for pip

* symbol whitelist

* maven build config

* Fix for import mxnet taking long time if multiple process launched (#13602)

* https://github.com/apache/incubator-mxnet/issues/12255
Running import mxnet in multiple processes takes a very long time.
Details: #12255
One of the reasons is the OMP tuning code, which iterates to measure the OMP
tuning overhead. We are reducing this iteration count to reduce the
overhead of the tuning code.
We also added an environment variable that lets users set the number
of cores used to determine tuning.

* cpplint fix

* Adding new environment variable: MXNET_USE_NUM_CORES_OPERATOR_TUNING to doc

* fixing formatting in doc
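
As a minimal sketch of how the new variable might be used: the name MXNET_USE_NUM_CORES_OPERATOR_TUNING comes from this change, while the helper `limit_tuning_cores` and the value 2 are illustrative assumptions, not part of MXNet.

```python
import os


def limit_tuning_cores(num_cores: int) -> None:
    """Hypothetical helper: cap the cores used for operator tuning.

    Call it before `import mxnet`, since the change description ties the
    tuning cost to importing the library.
    """
    os.environ["MXNET_USE_NUM_CORES_OPERATOR_TUNING"] = str(num_cores)


# The value 2 is an arbitrary illustration, not a recommended setting.
limit_tuning_cores(2)
# import mxnet as mx  # import only after the variable has been set
```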

* Add reshape op supported by MKL-DNN (#12980)

* Add reshape op supported by MKL-DNN

* fix build issue

* fix lint

* fix lint

* fix lint

* fix lint

* fix lint

* fix lint

* fix white space

* add unit test

* merge if blocks

* Improve dev_menu usability, local build and virtualenv (#13529)

* Improve dev_menu, add build command and virtualenv creation with local builds for easy testing

* Update dev_menu.py

Co-Authored-By: larroy <[email protected]>

* Cuda off by default, use ccache

* address CR

* [Clojure] Correct the versions in the README so they correspond to the latest maven.org release (#13507)

* Correct the versions so they correspond to the latest maven.org release

* trigger build

* feedback from @kohr-h

* Optimization of metric evaluation (#13471)

* Change argsort to argpartition

* Global statistics in metrics

* Fix lint

* Fixes from review

* Trigger

* Fixes from review, fix to F1, MCC and perplexity metrics,
added test for global stats

* Fix lint

* Fix compatibility with Python 2
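
One plausible reading of "global statistics in metrics" is that a metric keeps whole-run totals alongside counters that can be reset per logging interval. The toy class below sketches that idea in plain Python; the class and its method names are assumptions for illustration and are not copied from the MXNet metric module.

```python
class ToyAccuracy:
    """Illustrative accuracy metric with local and global counters."""

    def __init__(self):
        self.num_correct = 0          # since the last local reset
        self.num_total = 0
        self.global_num_correct = 0   # since the metric was created
        self.global_num_total = 0

    def update(self, labels, preds):
        correct = sum(int(l == p) for l, p in zip(labels, preds))
        self.num_correct += correct
        self.num_total += len(labels)
        self.global_num_correct += correct
        self.global_num_total += len(labels)

    def reset_local(self):
        # Clear only the per-interval counters; global totals survive.
        self.num_correct = 0
        self.num_total = 0

    def get(self):
        return self.num_correct / self.num_total if self.num_total else float("nan")

    def get_global(self):
        if not self.global_num_total:
            return float("nan")
        return self.global_num_correct / self.global_num_total


metric = ToyAccuracy()
metric.update([1, 0, 1, 1], [1, 0, 0, 1])   # 3 of 4 correct
first_interval = metric.get()
metric.reset_local()
metric.update([1, 1], [1, 1])               # 2 of 2 correct
second_interval = metric.get()
whole_run = metric.get_global()
```

Keeping both sets of counters lets a progress logger report interval accuracy without losing the figure for the whole run.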

* Revert "Feature/mkldnn static (#13628)" (#13638)

This reverts commit 5bcf2bd6e8b48fa27bfcfdafd06401ec2d28978b.

* support mkl log when dtype is fp32 or fp64 (#13150)

* support mkl log when dtype is fp32 or fp64

* remove macro

* ensure data size less than or equal MKL_INT_MAX

* code specification

* fix indent

* for retrigger

* [MXNET-1209] Tutorial transpose reshape  (#13208)

* transpose tutorial

* Adding Anirudhs comments

* Update tutorial with some more examples

* Adding links

* Fixing the links, adding more examples

* Update reshape_transpose.md

* Fixing spelling mistakes

* Updating image resolution

* Adding Simon's comments

* Small fixes

* Update reshape_transpose.md

* Update reshape_transpose.md

* empty commit

* empty commit

* updated reference to Apache MXNet (#13645)

* Complementary gluon DataLoader improvements (#13606)

* init

* add tests

* doc

* lint

* fix openmp

* Improve CCache handling (#13456)

* Remove gitignore entries

* Modify Makefile

* Modify user permissions

* Add new ccache wrapper function

* Change PATH rewrite to a different one to resolve CUDA issues

* Add ccache to gpu cmake

* Enable ccache for every build

* Set permissions for arm dockerfiles

* Disable ccache for ASAN

* Remove g++-8 ccache redirect

* Update Android Dockerfiles for user permissions

* Fix ASAN compiler typo

* Remove sanity for speed

* Move build dir creation in android armv8

* Revert "Remove sanity for speed"

This reverts commit e8386a774dafe96337930b9cac36cb24fc36585e.

* Add ccache for NVCC in Makefile

* [MXNET-918] Random module (#13039)

* introduce random API

* revert useless changes

* shorter types in APIDoc gen code

* fix after merge from master

* Trigger CI

* temp code / diag on CI

* cleanup type-class code

* cleanup type-class code

* fix scalastyle

* Fix incorrect delete in MXExecutorReshape exception handling (#13376)

* Fix bad delete.

Delete the pointed-to handle on cleanup, not the location of the handle itself. Also don't delete it if we didn't set it in the first place.

* Remove unused 'exec' var from MXExecutorBindEX.

* [MXNET-1251] Basic configuration to do static-linking (#13621)

* Basic configuration to do static-linking

* update build script and place it in the install part

* clean up the code further

* revert maven into build-from-source

* add curl to deps

* [MXNET-1195] Cleanup Scala README file (#13582)

* Updated the Scala README with up-to-date information

* Updated the header

* Removed redundant build status

* Minor formatting changes

* Addressed the PR feedback

* Added section on Scala training APIs

* Removed mention of deprecated Model API

* scripts for building libmxnet binary and wheel (#13648)

* add script for making all dependencies

* tools for building pip package

* build scripts for lib and wheel

* [MXNET-1083] Add the example to demonstrate the inference workflow using C++ API (#13294)

* [MXNET-1083] Add the example to demonstrate the inference workflow using C++ API

* [MXNET-1083] Add the example to demonstrate the inference workflow using C++ API

* Updated the code to address the review comments.

* Added the README file for the folder.

* Addressed the review comments

* Addressed the review comments to use argmax and default mean values.

* Update MKLDNN_README.md (#13653)

* Support Quantized Fully Connected by INT8 GEMM (#12922)

* add quantized fully connect support

* disable qfc cpu case since s8u8s32 is only supported by MKL BLAS library

* retrigger to ci testing

* move implementation to cc file and add  STORAGE_TYPE_ASSIGN_CHECK

* fix typo bug

* retrigger the ci test

* fix typo bug

* retrigger ci

* retrigger the ci test

* retrigger the ci

* retrigger the ci test

* retrigger ci test

* fix indent issue

* retrigger the ci

* retrigger the ci test

* add verbose message

* update log message

* using range for loop

* using for auto range

* enable MKL BLAS ci test

* fix typo issue

* use TYPE_ASSIGN_CHECK

* retrigger the ci

* add build fix for Scala/Java build (#13655)

* Fix Jetson compilation (#13532)

* remove omp which can cause ssd accuracy variance (#13622)

* Revert "[MXNET-43] Fix Jetson compilation" (#13665)

* Revert "remove omp which can cause ssd accuracy variance (#13622)"

This reverts commit 655f1c6f7a0706dd622f73db9af2e6df895ca213.

* Revert "Fix Jetson compilation (#13532)"

This reverts commit 48e25c4cae355753dd96ea7afe004bf78e0719e4.

* Fix Jetson compilation (#13666)

* turn on Sphinx warnings as errors (#13544)

* turn on warnings as errors

* move warnings as error logic to build_all_version

* fix typo in comment

* add warning as error option for docs pipeline

* bump ci to test again; use this chance to add notes on this feature

* fix bugs in image.py docs

* Update CODEOWNERS, add Pedro Larroy. (#13579)

* Revert "Revert "[MXNET-43] Fix Jetson compilation" (#13665)" (#13672)

This reverts commit 3433776dac7be75928082bbc1d552fca248fb8e8.

* Accelerate DGL csr neighbor sampling (#13588)

* Speedup and fix bug in dgl_csr_sampling op

* Update dgl_graph.cc

* simplify functions.

* avoid adding nodes in the last level in the queue.

* remove a hashtable lookup in neigh_pos.

* reduce a hashtable lookup in sub_ver_mp.

* merge copying vids and layers.

* reduce hashtable lookup when writing to output csr.

* fix a bug.

* limit the number of sampled vertices.

* fix lint.

* fix a compile error.

* fix compile error.

* fix compile.

* remove one hashtable lookup per vertex and hashtable iteration.

* remove queue.

* use vector for neigh_pos.

* fix lint

* avoid init output arrays.

* fix tests.

* fix tests.

* update docs.

* retrigger

* retrigger

* [MXNET-1252][1 of 2] Decouple NNVM to ONNX from NNVM to TensorRT conversion (#13659)

* fix unpicklable transform_first on windows (#13686)

* Move the debug output message into MXNET_MKLDNN_DEBUG (#13662)

* NEWS.md backport from v1.4.x to master (#13693)

* merge NEWS.md from 1.4.x to master

* NEWS.md backport from v1.4.x to master

* Fallback to dense version for grad(reshape), grad(expand_dims) (#13599)

* fallback to dense version for grad(reshape), grad(expand_dims)

* add _backward_reshape gpu version

* reshape test case comments

* fix gpu test

* remove mkldnn support for _backward_reshape

* ONNX export: Add Flatten before Gemm (#13356)

* Add Flatten before Gemm

* ONNX export test: Allow multiple inputs in forward pass

* ONNX export: Test for fully connected

* [MXNET-1164] Generate the document for cpp-package using Doxygen (#12977)

* Adding cpp-package directory to the Doxyfile. Updating the index.md file in c++ api directory.

* Updating the link to classes in C++ API to point to correct html file.

* Updated the links to use relative paths.

* Removed the extra slash character in the url

* Excluded the 3rdparty folder as per the review comment.

* Update git clone location to apache github (#13706)

* Add timeout/retry logic to docker cache download (#13573)

* Added timeout/retry (linear backoff) to docker cache download

* Units changed, as time.sleep takes seconds as argument

* Improved error handling

* Using retry decorator

* Added retry decorator to _login_dockerhub method

* Fixed wrong import
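
The retry decorator with linear backoff mentioned above could look roughly like the following. This is a sketch under assumptions: the decorator name, its parameters, and the `download_cache` function are hypothetical and are not taken from the CI scripts.

```python
import functools
import time


def retry(max_attempts, delay_seconds, exceptions=(Exception,)):
    """Retry the wrapped call, sleeping delay_seconds * attempt between tries."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts, surface the last error
                    # Linear backoff; time.sleep takes seconds.
                    time.sleep(delay_seconds * attempt)
        return wrapper
    return decorator


calls = []


@retry(max_attempts=3, delay_seconds=0, exceptions=(ConnectionError,))
def download_cache():
    # Hypothetical flaky operation: fails twice, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = download_cache()
```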

* Fix NDArray ToDLPack Bug (#13698)

* Added javadocs and improved example instructions (#13711)

* Rearrange tests written only for update_on_kvstore = True (#13514)

* Update test_gluon_trainer.py

* Update test_gluon_trainer.py

* test

* Update mshadow to support batch_dot with fp16. (#13716)

* fp16 dot

* update mshadow

* update mshadow

* update mshadow

* Fix the quantization script to support Python2 (#13700)

* fix the quantization script to support python2

* Fix comments, fix similar issue in imagenet_inference.py

* ONNX test code cleanup (#13553)

* ONNX test code cleanup

* Make tests use the common test case list

* Remove import test_cases

* Make Gluon backend rep common

* Partially enable broadcast tests

* Common function to populate tests

* Make backend common

* test models

* Test nodes

* ONNX export: Test for fully connected

* Edit CI scripts mxnet export test cleanup

* Further cleanup backend tests

* README

* Some corrections

* test case format for test_models

* update social media section (#13705)

* script for installing gpu libraries and build tools (#13646)

* Port of scala infer package to clojure (#13595)

* Port of scala infer package to clojure

* Add inference examples

* Fix project.clj

* Update code for integration tests

* Address comments and add unit tests

* Add specs and simplify interface

* Minor nit

* Update README

* update code owner (#13737)

* AdamW operator (Fixing Weight Decay Regularization in Adam) (#13728)

* tests

* remove optimizer and move op to contrib

* rename parameter
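
For context, the title refers to decoupled weight decay, where the decay term is applied to the weight directly instead of being folded into the gradient. The scalar sketch below illustrates that update under stated assumptions: it omits bias correction for brevity, and the function and parameter names are illustrative, not the signature of the contrib operator.

```python
import math


def adamw_step(w, g, m, v, lr, eta, beta1, beta2, eps, wd):
    """One decoupled-weight-decay Adam step on a scalar weight (sketch)."""
    m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
    # The wd * w term sits outside the adaptive gradient scaling.
    w = w - eta * (lr * m / (math.sqrt(v) + eps) + wd * w)
    return w, m, v


# With a zero gradient only the decay term acts on the weight.
w, m, v = adamw_step(w=1.0, g=0.0, m=0.0, v=0.0, lr=0.1, eta=1.0,
                     beta1=0.9, beta2=0.999, eps=1e-8, wd=0.01)
```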

* ONNX import/export: Add missing tests, ONNX export: LogSoftMax (#13654)

* Logsoftmax, missing tests

* Support multiple outputs in Gluon backendrep

* Remove repeated unsqueeze test

* Allow multiple output support

* ONNX test code cleanup - part 2 (#13738)

* Common test caller

* Remove incorrect comment

* Make corrections to CI

* fix ci script

* Update basic_layers.py (#13732)

* ONNX import: Hardmax (#13717)

* ONNX import: Hardmax

* Fix lint errors

* add github link for issue with reshape

* gluon docfix (#13631)

* Fixes for trainer with update_on_kvstore=False (#13721)

* add clarification for param_dict

* more tests for dist kvstore

* more unittests

* fix a bug

* more dist exception test

* revert optimizer list

* fix bug and comment

* fix doc rendering and lint

* add invalid sched test

* fix website

* trigger

* update doc

* Reorder module import orders for dist-kvstore (#13742)

* Reorder module import orders for dist-kvstore

* more code comments

* CMake: Enable installation of cpp-package headers (#13339)

* Allow CMake based installation of cpp-package

* Add installation of missing nnvm headers

* Add documentation as to where public headers will be installed

* disable error checking when building old versions (#13725)

* Integrate MKLDNN Conv1d and support 3d layout (#13530)

* add 3d layout support for MKLDNN Conv and Activation

* fix lint

* code refactor

* add testcase for group1 conv and skip quantization for conv1d

* fix lint

* avoid conv1d quantization

* code refactor and add activation ut

* del todo

* Making MKL-DNN default on MXNet master (#13681)

* make mkldnn the default in makefile and explicitly turn it off for builds

* add endif

* retrigger

* retrigger

* build mkldnn as static lib

* update makefile to statically build mkldnn

* build static mkldnn

* fix static name

* fix static name

* update static for mac

* rename mkldnn dep in ci

* remove moving mkldnn dynamic lib

* retrigger

* remove commented code

* retrigger

* remove mkldnn dynamic for unittest

* retrigger

* retrigger

* force static for mkldnn lib

* turn off mkldnn on arm builds

* remove dynamic mkldnn bind

* update jenkins to use only mkldnn

* remove last flag

* turn mkldnn by default on mac

* move mkldnn files for GPU MKLDNN build

* copy lib mxnet in gpu build

* only link windows

* add mkldnn.mk

* try force linking

* retrigger

* retrigger

* remove mkldnn dynamic check

* use ifndef

* remove test mkldnn install

* fix spacing

* fix index

* remove cp of mkldnn since statically linked

* add libmkldnn.a to list of files to pack

* include mkl_ml

* add mkldnn to pack

* add libiomp to ci pack

* move static libs

* fix typo

* pack mkldnn

* retrigger

* add linux artifacts

* move libmkldnn in gpu cmake build

* move libmkldnn and libiomp5 on gpu workspace

* move linked files

* fix typo

* fix typo

* add artifacts for tensorrt

* move mkldnn lib in scala build

* move mkldnn lib on cpu scala

* create dir for binding

* rename libmkldnn in scala

* move mklml dep in scala builds

* move mkl to another linked folder

* move libmkl to another dir

* add libmklml

* move mkldnn

* move mkldnn on centos

* specify new dynamic path

* retrigger

* remove mkldnn dynamic lib

* remove moving mkldnn artifact

* add ld path

* retrigger

* Revert "remove moving mkldnn artifact"

This reverts commit 16cca196e9e1ad92db74f4e8a01b3b052076d268.

* Revert "remove mkldnn dynamic lib"

This reverts commit d51043622d4ef7fcb95aff6a3e84d91ab71b48c9.

* update makefile

* Revert RPATH change and trigger CI

* correcting use-mkldnn flags for two tests

* mkldnn default on linux for starters

* reverting naming rules of pack_lib

* adding mkldnn=0 flags to centos non mkldnn builds

* adding mkldnn=0 flags to ubuntu gpu non mkldnn builds

* removing mkldnn binary operation for ubuntu gpu cmake non mkldnn build

* removing mkldnn binary operation for centos non-mkldnn unittest

* adding explicit USE_MKLDNN=0 flags for clang builds

* adding explicit USE_MKLDNN=0 flags for cpu ubuntu builds

* removing mkldnn binaries from non mkldnn builds scala gpu

* adding explicit flag mkldnn=0 for tensorrt gpu build

* adding explicit flag mkldnn=0 for ubuntu cmake asan

* adding centos cpu mkldnn tests to CI

* adding CentOS GPU MKLDNN build and unittest

* not keeping mkldnn default for mac os

* setting mkldnn default for x86_64 only

* running docs with mkldnn=0 flag

* removing CentOS CPU Scala MKLDNN test

* setting mkldnn default for x86_64 only

* not making mkldnn default on windows

* removing Centos MKLDNN tests from CI

* retrigger

* retrigger

* retrigger

* use relative links; update links (#13741)

* [MXNET-1231] Allow not using Some in the Scala operators (#13619)

* add initial commit

* update image classifier as well

* create Util class make Some conversion

* add test changes

* address comments

* fix the spacing problem

* fix generator base

* change name to Option

* fix bug in profiler tutorial when using cpu (#13695)

The try/except approach always selects ctx=mx.gpu(), because test_utils.list_gpus() returns an empty array instead of raising an error when no GPU is present.
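
A minimal sketch of the safer pattern is to branch on whether the returned list is empty rather than waiting for an exception. The function `pick_context` and the string return values are assumptions for illustration; with MXNet the caller would pass `mx.test_utils.list_gpus` and map the result to `mx.gpu()` or `mx.cpu()`.

```python
def pick_context(list_gpus):
    """Return "gpu" if list_gpus() reports any device, otherwise "cpu".

    list_gpus is any callable that returns a sequence of GPU ids and
    returns an empty sequence (rather than raising) when none exist.
    """
    gpus = list_gpus()
    return "gpu" if len(gpus) > 0 else "cpu"


no_gpu_choice = pick_context(lambda: [])       # machine without GPUs
one_gpu_choice = pick_context(lambda: [0])     # machine with one GPU
```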

* local docs build feature (#13682)

* make ROIAlign support position-sensitive pooling (#13088)

* make ROIAlign support position-sensitive pooling

* add unittest for RoIAlign op

* fix cpplint error

* fix python3 compatibility for unittest

* change OMP for better performance

* delete blank line to trigger CI

* add shape check when position_sensitive is true

* fix the typo

* typo: shuold -> should

* remove private() clause in omp statement

* add examples and fix the dependency problem (#13620)

* add examples and fix the dependency problem

* add Nightly run and optimized script

* add explanation for the line

* Update Adam optimizer documentation (#13754)

* Less cudaGet/SetDevice calls in Gluon execution (#13764)

* Remove unnecessary cudaGetDevice/cudaSetDevice calls

* Fixes for the DeviceGuard

* Retrigger CI

* Fix for possible invalid device ordinal when using DeviceStore while
driver is unloading

* Fix for RTC when the driver API call is the first call

* Added DeviceStore to pooled engine

* Scope requests so it's not needed for dev_menu (#13771)

* Fix USE_MKLDNN check in Makefile (#13775)

* fix makefile

* change make/config.mk

* add comments

* retrigger ci

* fix c compiler to clang (#13778)

* Fixed mailing list addresses (#13766)

* [MXNET-1255] update hybridize documentation (#13597)

* update hybridize documentation

* address review comments

* improve doc

* address comments

* address comments

* [MXNET-244] Work around likely compiler bug on nested inlines and temporary acces… (#13535)

* Work around likely compiler bug on nested inlines and temporary access to stream

* Don't compile khatri_rao tests if we don't have LAPACK

* Address CR comment

* Use curl to download sample data instead of wget. (#13761)

* fix bipartite match memory corruption (#13727)

* remove attributes clear on TRT nodes for GetOptimizedSymbol (#13703)

* Add CPU test coverage and refine cmake builds (#13338)

* add license (#13793)

* [MXNET-862] Basic maven jenkins pipeline (#13450)

* Jenkins Publish Nightly Maven

Progress

* Separate Build, Test, and Deploy Stages with parallel

* Re-organize Scala maven build (#13626)

* Re-organize scala maven build

1. Automatically detect which platform to build for scala.
2. Remove platform dependent submodules
3. Fix cyclic module dependencies
4. Fix scalatype style check
5. Now mvn can be executed in submodule
6. Maven build can be executed from any directory not only in root project
7. Checkin javah header file, and use verify task to detect native API changes
8. Improve incremental build performance
9. Remove unittest and integration-test profile, use proper task instead
10. Delete generated scala file during maven clean.

* Redo maven deploy related tasks.

1. Removed maven release plugin.
2. Make maven build friendly to CI, allow cli override version.
3. Moved gpg signing to deploy stage.
4. Created a separate deploy module.
5. Updated Makefile to new maven build change.
6. Remove unused nexus-staging-plugin
7. Added nightly and staging profile for CI.

* Support mkldnn for Scala.

* Add extra header file to export for error checking (#13795)

* add extra header file to include

* fix sanity check

* fix sanity

* move c_api_common.h to include folder

* fix build error

* keep c_api_common.h internal

* strip out error handling API into a separate header

* consolidate comment into one paragraph per review

* remove unnecessary include

* fix redirection issues; set default version to master (#13796)

* [MXNET-898] ONNX import/export: Sample_multinomial, ONNX export: GlobalLpPool, LpPool (#13500)

* ONNX import/export: Sample_multinomial

* ONNX export: GlobalLpPool, LpPool

* Handle default p_value

* Add tests for multinomial, lppool, globallppool

* add a comment about shape test

* whitelist symbols for using MXNet error handling externally (#13812)

* fix for params with no dims in onnx (#13413)

* fix for params with no dims

* fix

* fix

* retrigger build

* test added

* retrigger CI

* retrigger ci

* Remove semicolon in libmxnet.sym file (#13822)

* Remove semicolon in libmxnet.sym file

* empty commit to trigger CI

*  Clojure example for fixed label-width captcha recognition  (#13769)

* Clojure example for fixed label-width captcha recognition

* Update evaluation

* Better training and inference (w/ cleanup)

* Captcha generation for testing

* Make simple test work

* Add test and update README

* Add missing consts file

* Follow comments

* Update LICENSE File with subcomponents (#13808)

* Update LICENSE File with subcomponents

* Fix JavaScript licenses

* Dockerfiles for Publish Testing (#13707)

* Add new Maven build for Scala package (#13819)

* clean up build

* fix minor issue and add mkldnn

* fix mx_dist problem

* fix clojure build

* fix skip test

* ONNX ops: norm exported and lpnormalization imported (#13806)

* ReduceL1, l2 export, lpnormalization import added

* fix

* fix

* fix

* fix

* remove useless code (#13777)

* Fixing a symlink issue with R install (#13708)

* fix minor indentation (#13827)

* [MXNET-880] ONNX export: Random uniform, Random normal, MaxRoiPool (#13676)

* ONNX export: Random uniform, Random normal

* ONNX export: MaxRoiPool

* tests for maxroipool, randomnormal, randomuniform

* onnx export ops (#13821)

* onnx export ops

* retrigger ci

* retrigger ci

* fix

* [MXNET-1260] Float64 DType computation support in Scala/Java (#13678)

* Added Float64 as a supported datatype in NDArray

* Added unit tests for Float64 in NDArray

* Fix for failing Clojure unit tests

* Added Float and Double as MX_PRIMITIVES for computation in Scala

* Trying out second approach --> Private Impl methods with generic signature, and public methods calling the Impls

* Fixed errors in *= method

* Added Float64 in IO.scala and DataIter.scala

* Added another testcase for IO.DataDesc creation

* Fixed failing CI

* Added Float64 in Predictor class

* Added Float64 in Classifier class

* Added Double as a possible return type to : classifyWithNDArray

* Added unit tests for Classifier and Predictor.scala classes for Float64/Double

* Approach 3 --> Using a trait to mirror Float and Double in Scala

* Added comments on MX_PRIMITIVES.scala

* Added Float64/Double support for inference in ImageClassifier APIs

* Added unary- and compareTo in MX_NUMBER_LIKE

* Renamed MX_NUMBER_LIKE to MX_PRIMITIVE_TYPE

* Fixed linting issue

* Now specifying dType from the available data in copyTo and MXDataIter.scala for creating a new DataIterator

* Add primitives support handling to the generator for proper conversion

* Reduced code duplication in classify method in Classifier.scala

* Fix infer package for new signatures and address some bugs

* Removed code duplication in getPixelsArray

* remove debugging

* Changed classifyWithNDArray method in Classifier.scala

* Removed code duplication in predictImpl

* Satisfying lint god _/\_

* Fixed failing PredictorSuite test

* Renamed MX_FLOAT to Camel case

* Revert "Renamed MX_FLOAT to Camel case"

This reverts commit 9d7c3ce6f9c4d6ed2c46041a02e23c0f1df8dfe5.

* Added an implicit conversion from int--> float to support int operations in NDArrays. (These ops were already supported in the previous versions)

* Added Float64 as a training option to ImClassification Suite. Also added integration tests for it

* Satisfy Lint God _/\_

* Added Float64 support in Java NDArray

* Added Float64 support in Java's Predictor API

* Added yours truly to the Contributors list

* Added method comments on Predictor.predict with Array[Double] as a possible input

* Added method comments explaining what MX_PRIMITIVE_TYPE is

* Fixed errors caused by rebasing with master

* Added licences to the files

* [MXNET-1263] Unit Tests for Java Predictor and Object Detector APIs (#13794)

* Added unit tests for Predictor API in Java

* Added unit tests for ObjectDetectorOutput

* Added unit tests for ObjectDetector API in Java

* Addressed PR comments

* Added Maven SureFire plugin to run the Java UTs

* Pom file clean up -- moved surefire plugin to parent pom.xml

* Renamed skipTests to SkipJavaTests

* Fix scala doc build break for v1.3.1 (#13820)

* Fix doc build break for v1.3.1

* ignore errors on v1.3.x during scala docs gen

* Remove MXNET_STORAGE_FALLBACK_LOG_VERBOSE from test_autograd.py (#13830)

* Add Local test stage and option to jump directly to menu item from commandline (#13809)

* Removes unneeded nvidia driver ppa installation (#13814)

* Improve license_header tool by only traversing files under revision c… (#13803)

* Improve license_header tool by only traversing files under revision control

* use HEAD instead of master for CI

* Disabled flaky test (#13758)

* change to compile time (#13835)

* fix Makefile for rpkg (#13590)

* fix Makefile for rpkg

* update R and roxygen2 requirements

* add roxygen requirement

* add roxygen requirement

* [CI] Prevent timeouts when rebuilding containers with docker. (#13818)

* Prevent timeouts when rebuilding containers with docker.
Increase timeout from 120 to 180 for pipelines

* Increase docker cache timeout

* Increase timeout also for docs

* limit parallel builds to 10

* Code modification for  testcases of various network models in directory example (#12498)

* example testcase modified

* rcnn file add

* license add

* license init

* CI test trigger

* rcnn modify give up

* trigger

* modify for better user experience

* change the default parameter to xpu=None

* Update bdk_demo.py

* Update fcn_xs.py

* Update test.py

* Update train.py

* Update bdk_demo.py

* Update bdk_demo.py

* modify review comments

* refine

* modify Readmes according to the changed code.

* finetune READMEs

* re-trigger ci

* re-trigger ci twice

* Add copyrights for third party licenses to license file (#13851)

* Fix Tree Reduction on new instance type p3dn.24xlarge (#13852)

* add fallback for gpu topology detection using CUDA 9.2

* add fallback for gpu topology detection using CUDA 9.2

* add log

* update 3rdparty to master

* add fallback for gpu topology detection using CUDA 9.2

* add log

* update 3rdparty to master

* bring 3rdparty packages to upstream/master

* rebase to master

* Update gpu_topology.h

* [Clojure] package infer tweaks (#13864)

* change object detection prediction to be a map

* change predictions to a map for image-classifiers

* change return types of the classifiers to be a map
- add tests for base classifier and with-ndarray as well

* tweak return types and inputs for predict
- add test for plain predict

* updated infer-classify examples

* adjust the infer/object detections tests

* tweak predictor test

* Feedback from @kedarbellare review

* put scaling back in

* put back predict so it can handle multiple inputs

* restore original functions signatures (remove first)

* Modifying clojure CNN text classification example (#13865)

* Modifying clojure CNN text classification example

* Small fixes

* Another minor fix

* adding tolerance to flaky test (#13850)

* adding tolerance

* retrigger ci

* retrigger ci

* Julia v0.7/1.0 support and drop v0.6 support (#12845)

* Fix cpp examples build on Mac. (#13826)

This is a regression from adding the @rpath name to libmxnet.so on Mac:
the example executable is no longer able to find libmxnet.so.
Add an @rpath search path to fix this issue.

* Fix launch bounds in spatial transformer (#13188)

* Fix launch bounds in spatial transformer

* Adding explanation in comment.

* Update example scripts classpath. (#13849)

* [MXNET-1177]Adding Scala Demo to be run as a part of Nightly CI (#13823)

* Adding Scala Demo to be run as a part of Nightly CI

* Addressed PR feedback : making a profile to fetch nightly jars only on CI

* Changed name from scalacidemo to scala_ci_demo

* Synchronized the scala-demo and java-demo for nightly CI runs

* Pruned the maven command to simply maven install

* changed running from ./.sh to bash .sh to be consistent

* Add CODEOWNERS for Julia package (#13872)

* fix ssd quantization script error (#13843)

* fix ssd quantization script error

* update readme for ssd

* move quantized SSD instructions from quantization/README.md to ssd/README.md

* update ssd readme and accuracy

* update readme for SSD-vGG16

* Fix permissions of ci/docker/install/ubuntu_publish.sh (#13840)

* Avoid adding SegfaultLogger if process already has sig handler. (#13842)

In the current implementation, we override the signal handler regardless of whether MXNET_USE_SIGNAL_HANDLER=1 is set.
This breaks the calling process's behavior and causes the process to exit unexpectedly.
An example use case is libmxnet.so loaded into a Java process via JNI or JNA: the JVM will crash
due to SegfaultLogger.

In this PR, we do not register SegfaultLogger if a signal handler is already registered.
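
The "register only if nobody else has" idea can be sketched in Python with the standard `signal` module. This is an illustration only, not the C++ change in the PR; the helper name `install_if_unset` is hypothetical, and treating anything other than the default disposition as "already registered" is an assumption.

```python
import signal

def install_if_unset(signum, handler):
    """Install handler only when the signal still has its default disposition.

    Returns True if the handler was installed, False if an existing handler
    (Python-level, ignored, or installed outside Python) was left in place.
    """
    current = signal.getsignal(signum)
    if current != signal.SIG_DFL:
        # Something else already owns this signal; do not override it.
        return False
    signal.signal(signum, handler)
    return True
```

A host process that installs its own handler first keeps it, because a later call to `install_if_unset` for the same signal returns `False` and changes nothing.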

* fix the fetching GPU problem (#13889)

* Fix SN-GAN example doc (#13877)

* fix the wrong argument

* fix broken link

* update Spectral Normalization Code (#13868)

* update sn_code

* update sn_code

* Temporarily disable website testing (#13887)

* Fixed java benchmark failing error by fixing the classpath (#13891)

* Jenkins nightly maven with static build script and gpu (#13767)

* Added logging to GitHub commit status publishing (#13615)

* Add a test for SGLD optimizer with comparisons for set noise seeds. (#13762)

* [MXNET-703] Update to TensorRT 5, ONNX IR 3. Fix inference bugs. (#13310)

* [MXNET-703] Install CUDA 10 compatible cmake

This works around a CUDA 10 cmake issue documented here:
https://github.com/clab/dynet/issues/1457

This fix is temporary; once an updated cmake package is published to Ubuntu's
package repo it may be reverted.

* [MXNET-703] Update to TensorRT 5 ONNX IR 3. Fix inference bugs.

* [MXNET-703] Describe onnx opsets and major version

* Fix the order of error term's operands (#13745)

* fix the order of error term's operands

* address comments

* Add mkldnn OP for slice (#13730)

* add mkldnn slice

* fix lint

* fix lint

* mv SliceEx to matrix_op.cc

* fix lint

* optimize dispatch_mode

* retrigger ci

* fix indent

* fix bug in nag optimizer (#13683)

* fix bug in nag optimizer

```
grad += wd * weight
mom[:] += grad
grad[:] += self.momentum * mom
weight[:] += -lr * grad
```
This subtracts `wd * weight` twice, whereas in `state = momentum * state + grad + wd * weight` followed by `weight = weight - (lr * (grad + momentum * state))` it is subtracted only once.

* fix bug in nag test

fix bug in nag test

* rewrite nag test

* rewrite nag

* fix nag with in-place operations

* fix nag with in-place operations
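
For reference, the two formulas quoted in the NAG fix above can be written as a scalar Python sketch. The function name `nag_step` and the scalar (non-NDArray) form are assumptions for illustration; this is not the optimizer code from the PR.

```python
def nag_step(weight, grad, state, lr, momentum, wd):
    """One NAG update following the two quoted formulas (scalar sketch)."""
    # Weight decay enters only through the momentum state.
    state = momentum * state + grad + wd * weight
    # The look-ahead term uses the raw gradient plus the decayed state.
    weight = weight - lr * (grad + momentum * state)
    return weight, state
```

With `weight=1.0, grad=0.5, state=0.0, lr=0.1, momentum=0.9, wd=0.1`, the state becomes `0.6` and the weight moves to roughly `0.896`.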

*  #13813 examples with opencv4/origami (#13813)

* Fix BatchNorm converter for CoreML when fix_gamma=True (#13557)

* beta doc fixes (#13860)

* Update profiler doc (#13901)

* Update c_api_profile.cc

* Update c_api_profile.cc

* Fix for test always returning true (#13911)

* Add error checking for cpp examples. (#13828)

* add ccache to docs build (#13832)

* Java install info update (#13912)

* updated java dependency

* update to duplicated java cpu

* java gpu update

* Updated java dependency version information

* Static build instruction for MXNet in general (#13914)

* update scripts and tutorial

* add the static test for scala package

* kill publish test

* fix build issue

* address comments

* julia: fix `argmax` for NDArray (#13871)

- fix 0-based index output to 1-based index

close #13786

* Support populating errors back to MXNet engine in callback (#13922)

* add an optional error_msg in engine on_complete callback

* use dmlc::Error struct to make error population extendable

* Fix document build (#13927)

* fix doc build

* Revert "Temporarily disable website testing (#13887)"

This reverts commit 9d4281271c871a938f1ac4ee55b218872031963d.

* test_ImageRecordIter_seed_augmentation flaky test fix (#12485)

* Moves seed_aug parameter to ImageRecParserParam and re-seeds RNG before each augmentation to guarantee reproducibility

* Update image record iterator tests to check the whole iterator not only first image
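
Why re-seeding before each augmentation helps can be shown with a toy Python sketch. The names `augment` and `augment_all` and the jitter augmentation are hypothetical, and the real iterator may derive its per-image seed differently; the point is only that each result then depends on the seed rather than on how much randomness earlier images consumed.

```python
import random

def augment(pixel, rng):
    """Toy augmentation: add a random jitter to a single value."""
    return pixel + rng.uniform(-1.0, 1.0)

def augment_all(pixels, seed_aug):
    """Re-seed the generator before every augmentation."""
    rng = random.Random()
    out = []
    for pixel in pixels:
        rng.seed(seed_aug)  # same starting state for every image
        out.append(augment(pixel, rng))
    return out
```

Because the generator is reset each time, augmenting an image on its own gives the same result as augmenting it in the middle of a larger batch, which is what a whole-iterator reproducibility test needs.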

* Version switching user experience improvements (#13921)

* fix version switching for anchors and search

* improved redirects

* fix bug for dev previews; remove hardcoded protocol

* Julia: fix filename quoting in docstring (#13894)

Quoting filename with backticks to prevent
markdown mis-rendering some of them with underscore.

* disable default MKLDNN for cross compilation (#13893)

* disable default MKLDNN for cross compilation

* adding temporary debug logs

* Julia: deprecate `mx.empty`, replace it with `UndefInitializer` (#13934)

In Julia 0.7+, constructing an uninitialized array is provided via
the APIs:
        - `Array{T,N}(undef, dims...)`
        - `Array{T,N}(undef, dims)`
        - `Array{T}(undef,   dims...)`
        - `Array{T}(undef,   dims)`

There is an API `mx.empty(dims...)` serving this purpose.

This PR proposes deprecating the original API `mx.empty` and
providing the functionality with an API design similar to Julia's Base.

        - `NDArray{T,N}(undef, dims...)`
        - `NDArray{T,N}(undef, dims)`
        - `NDArray{T}(undef,   dims...)`
        - `NDArray{T}(undef,   dims)`
        - `NDArray(undef,      dims...)`
        - `NDArray(undef,      dims)`

e.g.

```julia
julia> NDArray{Int,2}(undef, 5, 2)
5×2 NDArray{Int64,2} @ CPU0:
 94290755905104  94290752678143
 94290752660544     68719476760
 94290752674408  94290737734368
 94290752660544              18
 94290752674408              18

julia> NDArray(undef, 5, 2)  # default type is `mx.MX_float`
5×2 NDArray{Float32,2} @ CPU0:
 -29112.406f0       5.2029858f-8
      3.0763f-41    6.7375383f-10
      1.7613131f19  0.0f0
      4.840456f30   0.0f0
      4.4262863f30  0.0f0
```

- The original `mx.empty` APIs are still functional.
  If a user invokes them, a deprecation warning is shown.

* Runtime feature detection (#13549)

* Prototype for runtime feature detection

* Includes from diamond to quotes

* Add CPU feature and BLAS flavour flags

* Add BLAS flavour and CPU SSE and AVX flags

* MXNET_USE_LAPACK

* Fix C++ linting errors

* Expose runtime feature detection in the public C API and in the Python API

* Refactor Storage -> FeatureSet

* Refine documentation

* Add failure case

* Fix pylint

* Address CR comments

* Reduce verbosity of container builds (wget output) (#13888)

* Add back R tests and fix typo around R and perl tests (#13940)

* Add back R tests and fix typo around R and perl tests

* Fix permissions

* Fix copy&paste mistake around roxygen and remove previous permission override

* fix doc of take operator (#13947)

* #13624 clojure nightly tests (#13624)

* Add erfinv operator for calculating inverse error function (#13811)

* add default behaviour for argmax

* prototype of erfinv

* add test

* gpu support

* Revert "add default behaviour for argmax"

This reverts commit 64e9f1a9e3c9cabf312b8d80b3520b22da31c0b6.

* move erfinv to contrib

* edit copyright

* remove atof

* use std and update license

* add license exclude file

* fix per eric's comments

* change license header
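
As a reference for what the operator computes, the inverse error function can be checked against `math.erf` with a simple bisection. This is a slow sketch for sanity checks, not the kernel added in the PR; the bracket `[-6, 6]` and the tolerance are assumptions.

```python
import math

def erfinv(y, tol=1e-12):
    """Invert math.erf on the open interval (-1, 1) by bisection."""
    if not -1.0 < y < 1.0:
        raise ValueError("erfinv is only finite on the open interval (-1, 1)")
    lo, hi = -6.0, 6.0  # math.erf saturates to +/-1.0 well inside this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)
```

A round trip such as `erfinv(math.erf(0.3))` should land back near `0.3`, which is a convenient property to assert in an operator test.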

* Update project.clj file to use the snapshots repo to be able to pull (#13935)

nightly Scala jar - also update readme

* Julia: add windows-cpu build (#13937)

- Julia v0.7
- Julia v1.0

* split_v2 (#13687)

* Update autoencoder example (#12933)

* Fixing the autoencoder example

* adding pointer to VAE

* fix typos

* Update README.md

* Updating notebook

* Update after comments

* Update README.md

* Update README.md

* Retrigger build

* Updates after review

* Static build for Python (#13916)

* add python unit test

* address comments

* switch sanity test to Gluon module test

* We don't run tests (╯‵□′)╯︵┻━┻

* add variant in the environment variable

* add document improvement

* kill the conflict

* Flaky maven binary download (#13974)

* Aggregate SGD (#13346)

* Aggregate SGD

* Make OpWrapperGenerator understand Tuple<float>

* Trigger

* Add NNVM Tuple to cpp-package op.h

* Trigger

* Fix pylint aggregate SGD

* Update info about new ENV vars and modifying 2 tests that require
update_on_kvstore to be true

* Fix

* Aggregate SGD support for Gluon trainer

* Added text to doc about aggregate update in SGD optimizer

* Docs changes from review

* Gradient multiplier (contrib) operator (#13632)

* Added the gradient reversal contrib operator

Missing test for backwards pass

* Fixed linting errors

* Fixed forward test

* Added random forward / backward test for gradient reversal

* Update test_contrib_operator.py

* Fixed typo in gradient reversal op description

* Replace forward code with the identity implementation

* Fixed typos in function docs

* Changed default behavior to identity

* Replaced backward code with scalar_mul

* Fixed backward operator and unit test

* Renamed operator to gradient multiplier

* Update test_contrib_operator.py

Retrigger flaky test

* Update gradient_multiplier_op.cc

Improved the description of the scalar multiplier
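
The behavior described above (identity in the forward pass, gradients scaled by a scalar in the backward pass) can be sketched on plain Python lists. The function names and the default `scalar=1.0` are assumptions based on the "default behavior to identity" note; this is not the contrib operator itself.

```python
def gradient_multiplier_forward(x):
    """Forward pass: return the input unchanged."""
    return list(x)

def gradient_multiplier_backward(out_grad, scalar=1.0):
    """Backward pass: scale the incoming gradient by `scalar`."""
    return [scalar * g for g in out_grad]
```

Passing `scalar=-1.0` recovers the gradient reversal behavior the operator started as, while the default leaves gradients untouched.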

* Update README.md (#13973)

* Fixing the doc for symbolic version of rand_zipfian (#13978)

* Fixes #12779

* Gluon end to end tutorial (#13411)

* initial draft gluon tutorial

* add reference

* add cpp inference

* improve wording

* address pr comments

* add util functions on dataset

* move util file

* update link

* fix typo, add test

* allow download

* update wording

* update links

* address comments

* use lr scheduler with optimizer

* separate into 2 tutorials

* add c++ tutorial to test whitelist

* [MXNET-1293] Adding Iterables instead of List to method signature for infer APIs in Java (#13977)

* Added Iterables as input type instead of List in Predictor for Java

* Added Iterables to ObjectDetector API

* Added tests for Predictor API

* Added tests for ObjectDetector

* Use CPUPinned context in ImageRecordIOParser2 (#13980)

* create NDArray with CPUPinned context in ImageRecordIOParser2

* update document

* use -1 device_id as an option to create CPU(0) context

* retrigger CI

* fix cpplint error

* Added optional parameters to BilinearResize2D to do relative scaling (#13985)

* Added optional parameters to BilinearResize2D to do relative scaling

* Removed unnecessary params in unit tests.

* Fixed deprecated casting style

* [MXNET-1301] Remove the unnecessary WaitAll statements from inception_inference example (#13972)

* Removed the unnecessary WaitAll statements

* Removed the WaitAll() calls wherever they are not necessary.

* [MXNET-1000] get Ndarray real value and form it from a NDArray (#12690)

* add visualize

* adding Any type input to form NDArray

* fix bug and add tests

* add a toString method

* add Visualize Util and migrate visualize structure to there

* update with tests

* refactor code

* fix the minor issue

* add multiple types support

* add changes on names and tests

* make code elegant and improve readability

* api change (#13903)

* ONNX export: Add Crop, Deconvolution and fix the default stride of Pooling to 1 (#12399)

* Added Deconvolution and Crop to ONNX exporter

* Added default for pool_type

* Sample python bilinear initializer at integral points in y-direction (#12983)

* Sample python bilinear initializer at integral points in y-direction

* Add unit test for bilinear initializer

* [MXNET-703] Minor refactor of TensorRT code (#13311)

* Python BucketingModule bind() with grad_req = 'add' (#13984)

* remember grad_req from bind and apply it to sub-modules

* unit-test for gradient accumulation with bucketing modules

* MXNET-1295 Adding integer index support to Sequence* family of operators. (#13880)

* Adding integer index support to Sequence* family of operators.

Adding ability to use int32 arrays, or any castable-to-int type, as
the sequence_length array to SequenceMask, SequenceLast, and
SequenceReverse. Previously these operaters all requred sequence_length
to be the same data type as the input array.

See MxNet Jira ticket here:
  https://issues.apache.org/jira/browse/MXNET-1295

See also GitHub issues here:
   https://github.com/apache/incubator-mxnet/issues/12649
   https://github.com/dmlc/gluon-nlp/issues/346

* Adding explicit braces to an if statement to fix g++ warning

* fixing sequence_mask.cu by adding IType to template

* Fixing whitespace errors reported by linter

* Adding unit tests

* Fixing length of lines to pass linter
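
What accepting integer lengths means can be illustrated with a pure-Python sequence mask over time-major data. The function below is a hypothetical stand-in, not the MXNet operator; it only shows that lengths are cast to integers independently of the data type.

```python
def sequence_mask(data, sequence_length, value=0.0):
    """Mask time-major data ([time][batch]) beyond each sequence's length.

    sequence_length may hold ints or any values castable to int; it does not
    need to share the data's type.
    """
    return [
        [x if t < int(sequence_length[b]) else value for b, x in enumerate(row)]
        for t, row in enumerate(data)
    ]
```

Calling it with float data and `[1, 3]` as lengths masks every step after the first for batch element 0 and leaves batch element 1 untouched; passing `[1.0, 3.0]` gives the same result.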

* Disabled flaky test test_negative_binomial_generator (#13784)

* Fix website error pages (#13963)

* fix error redirect

* add error artifacts for local build

* build docs with CPP package (#13983)

* Update scala-package gitignore configuration. (#13962)

* [MXNET-1232] fix demo and add Eclipse support (#13979)

* fix demo and add Eclipse support

* fix on docs

* fix typo

* Update docs/install/java_setup.md

Co-Authored-By: lanking520 <[email protected]>

* add fixes in docs

* fix compile error in debug mode (#13873)

The latest BufferEntry does not contain a ctx function, which results in compile errors.
Inside BufferEntry is an NDArray object, which holds the expected data.

* Image normalize operator - GPU support, 3D/4D inputs (#13802)

* CPU version of normalize operator is working and unit test added

* Add GPU implementation and tests

* Working GPU normalize transforms

* Add default values, fix imports, fix documentation

* Add backward implementation for image normalize

* Add tests for backward pass

* Move back operators to its original files

* Add review comments

* Add 4D example

* Make infer type generic

* Fix inline function build error

* make functions as inline to avoid multiple definition conflict across cc and cu

* Fix build errors

* Fix failing GPU tests

* remove debug; add support for v1.4.x docs; fix publish bug (#14015)

*  Return value docs for nd.random.* and sym.random.* (#13994)

* mx.random.multinomial python documentation updated, return type details added

* multinomial documentation clarified

* added basic case for negative_binomial

* added basic case for generalized_negative_binomial

* basic case added for gamma

* added basic case for exponential

* basic case added for randn

* remaining base cases added.

* randint case added

* cleaned up return types for random.py

* zboldyga added to contributors

* spacing typo correction

* updated symbol.random return types, minor correction to ndarray.random return types

* removed trailing whitespace in docs

* Julia: split ndarray.jl into several snippets (#14001)

- `ndarray/type.jl`
- `ndarray/context.jl`
- `ndarray/show.jl`
- `ndarray/remap.jl`
- `ndarray/array.jl`
- `ndarray/arithmetic.jl`
- `ndarray/comparison.jl`
- `ndarray/io.jl`
- `ndarray/reduction.jl`
- `ndarray/statistic.jl`
- `ndarray/linalg.jl`
- `ndarray/trig.jl`
- `ndarray/activation.jl`
- `ndarray/autoimport.jl`

* float32 -> float16 cast consistency across implementations (#13857)

* Added test showing float32->float16 discrepancy when mshadow float2half() is used.

* Temp update mshadow submodule SHA to point to PR368 (b211cb7).

* Temp switch to url = https://github.com/DickJC123/mshadow.git

* Update mshadow submodule SHA.

* Improve code style per reviewer comments.

* Move back to dmlc/mshadow.git, now with float->half rounding.

* Expand test_operator.py:test_cast_float32_to_float16 to test np.nan.
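
The kind of discrepancy such a test looks for can be reproduced with Python's `struct` module, whose `e` format packs IEEE 754 half precision. Assuming round-to-nearest-even is the intended behavior (the log above only says "rounding"), a value exactly halfway between two representable halves distinguishes rounding from truncation. The helper name `to_float16` is hypothetical.

```python
import struct

def to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]
```

Between 2048 and 4096 half precision only represents even integers, so `to_float16(2051.0)` gives `2052.0` under round-to-nearest-even, whereas a truncating conversion would give `2050.0`.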

* Improve bulking in Gluon (#13890)

* Improve bulking in Gluon

* Trigger CI

* Fix MXNet R package build (#13952)

* fix mxnet r package build

* add ci

* remove mkldnn-gpu test for R

* add minimal test for MKLDNN-R

* pick mlp as minimal R test

* Fix inconsistent handling for FResourceRequestEx for imperative and symbolic executor (#14007)

* Update op_attr_types.h

* Update attach_op_resource_pass.cc

* [MXNET-1180] Java Image API (#13807)

* add java example

* add test and change PredictorExample

* add image change

* Add minor fixes

* add License

* add predictor Example tests

* fix the issue with JUnit test

* Satisfy Lint God ʕ •ᴥ•ʔ

* update the pom file config

* update documentation

* add simplified methods

* Export resize and support batch size (#14014)

* add image resize operator and unit test

* refactor the resize operator and address lint issues

* address comment and add doc

* assert size is more than 2

* add test case of 4D input

* use ndarray datatype

* add inline to Shape

* add 4D input example

* refactor the duplicate code and separate the resize from image_random

* clean up the code

* add resize implementation

* delete the variable not used

* refactor the code with structure and enum to make code more understandable

* fix the lint

* address comments

* address comment 1. add description 2. refactor unit test and add dtype

* update data type check

* lint

* move the common utility to image_utils

* add default value for keep_ratio

* change the operator doc

* update the image utility function

* fix lint

* use Hang implementation to achieve image resize operator GPU

* update the check and doc

* refactor the caffe_gpu_interp2_kernel

* update doc and fix the cpu compile error

* update the comment

* fix lint

* add unit test for gpu

* address comments

* remove the crop and centercrop utility functions to keep the PR clear

* fix the syntax error

* delete the warning

* add unit test with 4D

* fix typo

* add more unit test

* fix unit test

* set atol = 1

* fix missing numpy import

* fix the unit test

* delete test case

* fix unit test missing dependency

* fix error data type

* unify the style and add invalid interp

* update the doc

* add NAG optimizer to r api (#14023)

* Now passing DType of Label downstream to Label's DataDesc object (#14038)

* fix test_stn (#14063)

* re-enable test after issue fixed https://github.com/apache/incubator-mxnet/issues/10973 (#14032)

* Remove all usages of makefile for scala (#14013)

* Remove all usages of makefile for scala

* Unify making folders for scala/java setup

* Fix mxdoc path

* Add batch mode to calls

* fix nightly test on tutorials (#14036)

* fix nightly test

* fix typo

* trigger ci

* update the scala installation tutorial on intellij (#14033)

* update the scala installation tutorial on intellij

* update the so answer

* update the so answer

* Image ToTensor operator - GPU support, 3D/4D inputs (#13837)

* Add CPU implementation of ToTensor

* Add tests for cpu

* Add gpu implementation and tests

* Fix lint issues

* Cleanup includes

* Move back changes to original image operators files

* Add 4D example

* resolve merge conflicts

* Fix failing tests

* parallelize on channel in kernel launch

* rewrote the concat test to avoid flaky failures (#14049)

ran 10000 times with no failures

* Fix website scala doc (#14065)

* Fix doc building

* Remove duplicate in

* [Clojure] Add resource scope to clojure package (#13993)

* Add resource scope to clojure package

* add rat

* fix integration test

* feedback from @benkamphaus
- move from defs to atoms to make the tests a bit better

* adding alias with-do and with-let 
more tests

* another test

* Add examples in docstring

* refactor example and test to use resource-scope/with-let

* fix tests and problem with laziness 
now they work as expected!

* refactor to be a bit more modular

* remove comments

* Update NOTICE (#14043)

* modifying SyncBN doc for FP16 use case (#14041)

LGTM

* add new cloud providers to install page (#14039)

* add new cloud providers

* fix colon

* CUDNN dropout (#13896)

* cudnn dropout

* test dropout as stateful op

* add cudnn_off

* refactor

* fix bug when using inf forward

* turn on cudnn in gluon

* reuse dropout state space

* dropout passthrough

* address comments

* fix test_depthwise_convoltuion for occasional CI failures (#14016)

* keeping same contexts for comparison

* enabling test

* testing default context

* Revert "testing default context"

This reverts commit 1f95d0228178debde14680839bb6abab14c6d049.

* Disabling test due to CI failure on MKL-DNN

* ONNX export: broadcast_to, tile ops (#13981)

* Expand,tile op export

* fix

* adding test cases

* adding comments

* [MXNET-1258]fix unittest for ROIAlign Operator (#13609)

* fix roi align test

* retrigger unittest

* add more test detail for ROIAlign test

* remove url in test_op_roi_align

* remove blank line in test_op_roi_align in test_operator

* merge master

* Update test_operator.py

* retrigger CI

* Fix performance regression in normalize operator (#14055)

* parallelize on channel forward pass

* parallelize on channel normalize backward pass

* Fix lint issues

* Trying to fix CI build failure on GPU

* Fix failing GPU test on CI Do not pass normalize param as is to GPU kernel

* Fix to_tensor tests

* Pass mean and std_dev as native types for kernel

* Fix CI failure. Do not pass mean, std as vector to kernel

* Add maven wrapper to scala project. (#13702)

* Increase performance of BulkAppend and BulkFlush (#14067)

* Better bulkappend

* Fix lint

* [MXNET-1178] updating scala docs (#14070)

* updating scala docs

* Addressed PR feedback

* update the version name (#14076)

* [MXNET-1121] Example to demonstrate the inference workflow using RNN (#13680)

* [MXNET-1121] Example to demonstrate the inference workflow using RNN

* Addressed the review comments. Updated the ReadMe files.

* Removed the unnecessary creation of NDArray

* Added the unit tests to nightly tests to catch the failure.

* Updated the makefiles and unit tests so that the examples are built and tested in nightly

* Added the visual representation of the model and fixed the CI failure.

* Added the missing pdf file.

* Fixing the broken ci_test.sh

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/README.md

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Update cpp-package/example/inference/simple_rnn.cpp

Co-Authored-By: leleamol <[email protected]>

* Applying unresolved changes to README file.

* Fixing the CI build failure.

* Updated the RNN example from sequence generation to sentiment analysis

* Updated the readme files. Updated the example to use trained model and updated the unit test.

* Addressed the review comment to increase the default sequence length. Added the examples with inputs of various lengths.

* Updated the example to handle variable length input. Updated the readme and unit test files accordingly.

* Updated the example to share the memory between executors by creating shared executors.

* Updated the creation of executors from largest to smallest bucket key

* Creating the executor for the highest bucket key.

* Updated the unit test to check for the results in a range and modified the function name to be consistent with others.

* Fixed the logic to find the right bucket.

* hybridize rnn and add model graph (#13244)

* hybridize rnn and add model graph

* trigger CI

* separate mxboard visualization

* add options and she-bang

* add defaults

* trigger CI

* rename export-model

* Exclude concat layer  for gpu quantization (#14060)

* exclude concat for gpu quantization

* remove quantized_concat test in non-subgraph flow

* Remove inplace support for ToTensor operator (#14083)

* Remove stale check for op req type

* Do not register to tensor operator with in place option.

* [MKLDNN] Enable signed int8 support for convolution. (#13697)

* Enable s8s8 support for MKLDNN convolution.

* Fix cpp build

* Fix build.

* Fix build

* Remove openmp min/max reduction for windows build

* Add mkldnn_OIhw4i16o4i_s8s8 support

* Add all s8s8 weight format

* Change ssd quantize script.

* Update

* Manually cast mshadow shape size to size_t

* Fix merge.

* Fix perl package.

* Retrigger CI

* Fix GPU test

* Fix GPU test

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Remove weight_channelwise_scale from params.

* Fix

* Keep API compatible.

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Address comments.

* fix.

* Address debug build.

* Add comment for next_impl

* Rerun ci

* Add new api MXExecutorSetMonitorCallbackEX

* Add default value for monitor_all for cpp header.

* Rerun CI

* fix

* script change for uint8.

* trigger ci

* trigger ci

* [MXNET-1291] solve pylint errors in examples with issue no.12205 (#13815)

* Unify the style here

Unify the style here and remove the testing 'print' code segment.

* Unify the description of comment

Change the description of comment from "multi-layer perceptron" to "Get multi-layer perceptron"

* Unify the style of comments

Unify the style of comments suggested by @sandeep-krishnamurthy

* git pull the latest code from master of incubator-mxnet

* Complete rebase

* Solve PEP8 [C0304 ] Final newline missing

Solve example/deep-embedded-clustering/solver.py(150): [C0304 ] Final newline missing

* fix merge issue

* skip output_names unittest for mxnet-ngraph
ashokei authored Feb 12, 2019
1 parent 631dbcf commit 5a444a1
Showing 391 changed files with 15,516 additions and 5,136 deletions.
35 changes: 11 additions & 24 deletions .gitignore
@@ -63,11 +63,10 @@ __pycache__
*.states
*.json
*.d
build
cmake-build*
data
model
recommonmark
deps

# R
*.Rcheck
@@ -96,6 +95,8 @@ input.txt*

# Jetbrain
.idea
.gradle
*.iml

# ctags
tags
@@ -104,28 +105,14 @@ tags
cscope.out
cscope.files

# Scala package
*.class
scala-package/*/target/
scala-package/*/*/target/
*.scala_dependencies
*.worksheet
*.idea
*.iml
*.classpath
*.project
*.settings
!scala-package/*/bin
*.bak
*/node_modules/

# Eclipse project config
.project
.cproject
.classpath
.settings
.pydevproject
CMakeFiles
cmake_install.cmake
lib

# Visual Studio Code
.vscode
@@ -145,12 +132,12 @@ tools/pip_package/mxnet.egg-info
tools/pip_package/mxnet

# temporary path for building dependencies when building wheel
./deps/
bld
./tmp/*
*.jar
target
bin/im2rec
deps/
staticdeps/
tmp/
build/
lib/
bin/
model/

# VTune
2 changes: 1 addition & 1 deletion 3rdparty/mshadow
Submodule mshadow updated 1 files
+94 −28 mshadow/half.h
28 changes: 25 additions & 3 deletions CMakeLists.txt
@@ -1,7 +1,17 @@
cmake_minimum_required(VERSION 3.0.2)

# workaround to store CMAKE_CROSSCOMPILING because is getting reset by the project command
if(CMAKE_CROSSCOMPILING)
set(__CMAKE_CROSSCOMPILING ${CMAKE_CROSSCOMPILING})
set(__CMAKE_CROSSCOMPILING_OVERRIDE ON)
endif()

project(mxnet C CXX)

if(__CMAKE_CROSSCOMPILING_OVERRIDE)
set(CMAKE_CROSSCOMPILING ${__CMAKE_CROSSCOMPILING})
endif()

if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/build/private/local_config.cmake)
include(${CMAKE_CURRENT_SOURCE_DIR}/build/private/local_config.cmake)
endif()
@@ -21,7 +31,7 @@ mxnet_option(USE_LAPACK "Build with lapack support" ON)
mxnet_option(USE_NGRAPH "Build with nGraph support" ON)
mxnet_option(USE_MKL_IF_AVAILABLE "Use MKL if found" ON)
mxnet_option(USE_MKLML_MKL "Use MKLDNN variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND (NOT APPLE))
mxnet_option(USE_MKLDNN "Use MKLDNN variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND (NOT APPLE) AND (NOT MSVC) AND (CMAKE_SYSTEM_PROCESSOR MATCHES x86_64) AND (NOT USE_NGRAPH))
mxnet_option(USE_MKLDNN "Use MKLDNN variant of MKL (if MKL found)" ON IF USE_MKL_IF_AVAILABLE AND (NOT APPLE) AND (NOT MSVC) AND (CMAKE_HOST_SYSTEM_PROCESSOR STREQUAL "x86_64") AND (NOT CMAKE_CROSSCOMPILING))
mxnet_option(USE_OPERATOR_TUNING "Enable auto-tuning of operators" ON IF NOT MSVC)
mxnet_option(USE_GPERFTOOLS "Build with GPerfTools support (if found)" ON)
mxnet_option(USE_JEMALLOC "Build with Jemalloc support" ON)
@@ -42,6 +52,10 @@ mxnet_option(USE_TENSORRT "Enable infeference optimization with TensorRT
mxnet_option(USE_ASAN "Enable Clang/GCC ASAN sanitizers." OFF)
mxnet_option(ENABLE_TESTCOVERAGE "Enable compilation with test coverage metric output" OFF)

message(STATUS "CMAKE_CROSSCOMPILING ${CMAKE_CROSSCOMPILING}")
message(STATUS "CMAKE_HOST_SYSTEM_PROCESSOR ${CMAKE_HOST_SYSTEM_PROCESSOR}")
message(STATUS "CMAKE_SYSTEM_PROCESSOR ${CMAKE_SYSTEM_PROCESSOR}")

message(STATUS "CMAKE_SYSTEM_NAME ${CMAKE_SYSTEM_NAME}")
if(USE_CUDA AND NOT USE_OLDCMAKECUDA)
message(STATUS "CMake version '${CMAKE_VERSION}' using generator '${CMAKE_GENERATOR}'")
@@ -257,6 +271,7 @@ if(USE_TENSORRT)
include_directories(${ONNX_PATH})
include_directories(3rdparty/onnx-tensorrt/)
include_directories(3rdparty/)
include_directories(3rdparty/onnx-tensorrt/third_party/onnx/)
add_definitions(-DMXNET_USE_TENSORRT=1)
add_definitions(-DONNX_NAMESPACE=onnx)

@@ -285,7 +300,7 @@ if(ENABLE_TESTCOVERAGE)
if(NOT GCOV_PATH)
message(FATAL_ERROR "gcov not found! Aborting...")
endif() # NOT GCOV_PATH

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} --coverage")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage")
set(CMAKE_LINKER_FLAGS "${CMAKE_LINKER_FLAGS} --coverage")
@@ -302,9 +317,11 @@ if(USE_MKLDNN)
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} /EHsc")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /EHsc /Gy")
endif()

set(WITH_TEST OFF CACHE INTERNAL "" FORCE)
set(WITH_EXAMPLE OFF CACHE INTERNAL "" FORCE)
set(ARCH_OPT_FLAGS "" CACHE INTERNAL "" FORCE)

add_subdirectory(3rdparty/mkldnn)

include_directories(3rdparty/mkldnn/include)
@@ -325,7 +342,6 @@ if(USE_CUDA)
if(NOT CUDA_TOOLSET)
set(CUDA_TOOLSET "${CUDA_VERSION_STRING}")
endif()
set(CMAKE_GENERATOR_TOOLSET "cuda=${CUDA_TOOLSET},host=x64")
else()
set(FIRST_CUDA FALSE)
endif()
@@ -477,12 +493,14 @@ if(USE_OPENMP)
endif()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
add_definitions(-DMXNET_USE_OPENMP=1)
else()
if(OPENMP_FOUND)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
add_definitions(-DMXNET_USE_OPENMP=1)
endif()
endif()
elseif(UNIX AND NOT ANDROID)
@@ -815,6 +833,10 @@ install(TARGETS ${MXNET_INSTALL_TARGETS}
# https://cmake.org/cmake/help/v3.0/variable/CMAKE_INSTALL_PREFIX.html
# https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html

# NOTE: Public headers will be installed into ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_INCLUDEDIR}, see
# https://cmake.org/cmake/help/v3.0/variable/CMAKE_INSTALL_PREFIX.html
# https://cmake.org/cmake/help/v3.0/module/GNUInstallDirs.html

install(DIRECTORY include/ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
install(DIRECTORY 3rdparty/tvm/nnvm/include/ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
if (INSTALL_EXAMPLES)
1 change: 1 addition & 0 deletions CONTRIBUTORS.md
Original file line number Diff line number Diff line change
Expand Up @@ -194,6 +194,7 @@ List of Contributors
* [Harsh Patel](https://github.com/harshp8l)
* [Xiao Wang](https://github.com/BeyonderXX)
* [Piyush Ghai](https://github.com/piyushghai)
+* [Zach Boldyga](https://github.com/zboldyga)

Label Bot
---------
2 changes: 1 addition & 1 deletion MXNET_README.md
@@ -77,7 +77,7 @@ Features
* Mix and match imperative and symbolic programming to maximize flexibility and efficiency
* Lightweight, memory efficient and portable to smart devices
* Scales up to multi GPUs and distributed setting with auto parallelism
-* Support for Python, R, Scala, C++ and Julia
+* Support for Python, Scala, C++, Java, Clojure, R and Julia
* Cloud-friendly and directly compatible with S3, HDFS, and Azure

License
29 changes: 24 additions & 5 deletions Makefile
@@ -76,6 +76,7 @@ endif
endif
endif
endif

ifeq ($(USE_MKL2017), 1)
$(warning "USE_MKL2017 is deprecated. We will switch to USE_MKLDNN.")
USE_MKLDNN=1
@@ -230,6 +231,16 @@ ifeq ($(USE_CUDNN), 1)
LDFLAGS += -lcudnn
endif

+ifeq ($(use_blas), open)
+CFLAGS += -DMXNET_USE_BLAS_OPEN=1
+else ifeq ($(use_blas), atlas)
+CFLAGS += -DMXNET_USE_BLAS_ATLAS=1
+else ifeq ($(use_blas), mkl)
+CFLAGS += -DMXNET_USE_BLAS_MKL=1
+else ifeq ($(use_blas), apple)
+CFLAGS += -DMXNET_USE_BLAS_APPLE=1
+endif

# whether to use F16C instruction set extension for fast fp16 compute on CPU
# if cross compiling you may want to explicitly turn it off if target system does not support it
ifndef USE_F16C
@@ -485,6 +496,7 @@ endif
ifeq ($(CI), 1)
MAVEN_ARGS := -B
endif

# For quick compile test, used smaller subset
ALLX_DEP= $(ALL_DEP)

@@ -494,7 +506,7 @@ build/src/%.o: src/%.cc | mkldnn ngraph

build/src/%_gpu.o: src/%.cu | mkldnn ngraph
@mkdir -p $(@D)
-$(NVCC) $(NVCCFLAGS) $(CUDA_ARCH) -Xcompiler "$(CFLAGS)" -M -MT build/src/$*_gpu.o $< >build/src/$*_gpu.d
+$(NVCC) $(NVCCFLAGS) $(CUDA_ARCH) -Xcompiler "$(CFLAGS)" --generate-dependencies -MT build/src/$*_gpu.o $< >build/src/$*_gpu.d
$(NVCC) -c -o $@ $(NVCCFLAGS) $(CUDA_ARCH) -Xcompiler "$(CFLAGS)" $<

# A nvcc bug cause it to generate "generic/xxx.h" dependencies from torch headers.
@@ -521,6 +533,7 @@ build/plugin/%.o: plugin/%.cc | ngraph
ifeq ($(UNAME_S), Darwin)
LDFLAGS += -Wl,-install_name,@rpath/libmxnet.so
endif

# NOTE: to statically link libmxnet.a we need the option
# --Wl,--whole-archive -lmxnet --Wl,--no-whole-archive
lib/libmxnet.a: $(ALLX_DEP)
@@ -621,14 +634,21 @@ rpkg:
mkdir -p R-package/inst/libs
cp src/io/image_recordio.h R-package/src
cp -rf lib/libmxnet.so R-package/inst/libs

+if [ -e "lib/libmkldnn.so.0" ]; then \
+cp -rf lib/libmkldnn.so.0 R-package/inst/libs; \
+cp -rf lib/libiomp5.so R-package/inst/libs; \
+cp -rf lib/libmklml_intel.so R-package/inst/libs; \
+fi

mkdir -p R-package/inst/include
cp -rf include/* R-package/inst/include
rm R-package/inst/include/dmlc
rm R-package/inst/include/nnvm
cp -rf 3rdparty/dmlc-core/include/* R-package/inst/include/
cp -rf 3rdparty/tvm/nnvm/include/* R-package/inst/include
Rscript -e "if(!require(devtools)){install.packages('devtools', repo = 'https://cloud.r-project.org/')}"
-Rscript -e "if(!require(devtools)||packageVersion('roxygen2') < '6.1.1'){install.packages('roxygen2', repo = 'https://cloud.r-project.org/')}"
+Rscript -e "if(!require(roxygen2)||packageVersion('roxygen2') < '6.1.1'){install.packages('roxygen2', repo = 'https://cloud.r-project.org/')}"
Rscript -e "library(devtools); library(methods); options(repos=c(CRAN='https://cloud.r-project.org/')); install_deps(pkg='R-package', dependencies = TRUE)"
cp R-package/dummy.NAMESPACE R-package/NAMESPACE
echo "import(Rcpp)" >> R-package/NAMESPACE
@@ -674,9 +694,8 @@ clean: rclean cyclean $(EXTRA_PACKAGES_CLEAN)
$(RM) -r $(patsubst %, %/*.d, $(EXTRA_OPERATORS)) $(patsubst %, %/*/*.d, $(EXTRA_OPERATORS))
$(RM) -r $(patsubst %, %/*.o, $(EXTRA_OPERATORS)) $(patsubst %, %/*/*.o, $(EXTRA_OPERATORS))
else
-clean: ngraph_clean rclean mkldnn_clean cyclean testclean $(EXTRA_PACKAGES_CLEAN)
-$(RM) -r build lib bin *~ */*~ */*/*~ */*/*/*~ R-package/NAMESPACE R-package/man R-package/R/mxnet_generated.R \
-R-package/inst R-package/src/image_recordio.h R-package/src/*.o R-package/src/*.so mxnet_*.tar.gz
+clean: rclean mkldnn_clean cyclean testclean $(EXTRA_PACKAGES_CLEAN)
+$(RM) -r build lib bin *~ */*~ */*/*~ */*/*/*~
cd $(DMLC_CORE); $(MAKE) clean; cd -
cd $(PS_PATH); $(MAKE) clean; cd -
cd $(NNVM_PATH); $(MAKE) clean; cd -
2 changes: 1 addition & 1 deletion NOTICE
@@ -1,5 +1,5 @@
Apache MXNET (incubating)
-Copyright 2017-2018 The Apache Software Foundation
+Copyright 2017 and onwards The Apache Software Foundation

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
2 changes: 1 addition & 1 deletion R-package/R/context.R
@@ -22,7 +22,7 @@ init.context.default <- function() {

#' Set/Get default context for array creation.
#'
-#' @param new, optional takes \code{mx.cpu()} or \code{mx.gpu(id)}, new default ctx.
+#' @param new optional takes \code{mx.cpu()} or \code{mx.gpu(id)}, new default ctx.
#' @return The default context.
#'
#' @export
2 changes: 1 addition & 1 deletion R-package/R/model.R
@@ -562,7 +562,7 @@ mx.model.FeedForward.create <-
#'
#' @param model The MXNet Model.
#' @param X The dataset to predict.
-#' @param ctx mx.cpu() or mx.gpu(i) The device used to generate the prediction.
+#' @param ctx mx.cpu() or mx.gpu(). The device used to generate the prediction.
#' @param array.batch.size The batch size used in batching. Only used when X is R's array.
#' @param array.layout can be "auto", "colmajor", "rowmajor", (detault=auto)
#' The layout of array. "rowmajor" is only supported for two dimensional array.

0 comments on commit 5a444a1
