Error was detected while a MEX-file was running and MATLAB is exiting because of fatal error
Naveen kumar Elumalai
on 14 Mar 2019
Commented: Srinidhi Ganeshan
on 21 Mar 2019
I am trying to run the batched version of QR (dgeqrfbatched) in MATLAB using cuBLAS by calling it from a MEX file. I am stuck on this error and have not been able to find an answer. Is there any workaround for this problem? I am attaching the code that I am running and the crash report.
#include "mex.h"
#include "cublas_v2.h"
// The MEX gateway function.
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    // Get input variables from Matlab (host variables).
    double **A;
    // Get dimensions of input variables from Matlab.
    size_t m, n, k;
    const mwSize *Adims;
    Adims = mxGetDimensions(prhs[0]);
    //Bdims = mxGetDimensions(prhs[1]);
    m = Adims[0];
    n = Adims[1];
    k = Adims[2];
    A = (double**)mxGetPr(prhs[0]);
    int lda = m;
    const int batchSize=k;
    //step -1 Allocate storage for batch count
    double **tau;
    tau = (double**)malloc(batchSize * sizeof(double*));
    for (int i = 0; i < batchSize; i++)
        tau[i] = (double*)malloc(n * sizeof(double));
    int *info;
    info = (int*)malloc(batchSize * sizeof(int));
    //step -2 create host pointer array to the gpu array
    double **d_A, **d_TAU, **h_d_A, **h_d_TAU;
    h_d_A = (double**)malloc(batchSize * sizeof(double*));
    h_d_TAU = (double**)malloc(batchSize * sizeof(double*));
    for (int i = 0; i < batchSize; i++) {
        cudaMalloc((double**)&h_d_A[i], m*n * sizeof(double));
        cudaMalloc((double**)&h_d_TAU[i], n * sizeof(double));
    }
    //step -3 copy host array of pointers to device
    cudaMalloc((double**)&d_A, batchSize * sizeof(double*));
    cudaMalloc((double**)&d_TAU, batchSize * sizeof(double));
    cudaMemcpy(d_A, h_d_A, batchSize * sizeof(double*), cudaMemcpyHostToDevice);
    cudaMemcpy(d_TAU, h_d_TAU, batchSize * sizeof(double*), cudaMemcpyHostToDevice);
    for (int i = 0; i < batchSize; i++) {
        cudaMemcpy(h_d_A[i], A[i], m *n * sizeof(double), cudaMemcpyHostToDevice);
        cudaMemcpy(h_d_TAU[i], tau[i], n * sizeof(double), cudaMemcpyHostToDevice);
    }
    // --- CUBLAS initialization
    cublasHandle_t cublas_handle;
    cublasDgeqrfBatched(cublas_handle, m, n, d_A, lda, d_TAU, info, batchSize);
    for (int i = 0; i < batchSize; i++)
        cudaMemcpy(A[i], h_d_A[i], m*n * sizeof(double), cudaMemcpyDeviceToHost);
    //print the A matrix
    for (int k = 0; k < batchSize; k++) {
        for (int j = 0; j < m; j++) {
            for (int i = 0; i < n; i++) {
                int index = j * m + i;//not tested
                //count = count + 1;
                printf("\n %d The values are %lf",k+index, A[k][index]);
            } // i
        } // j
    } // k
}
When I execute the above program, this is the crash report I get:
Segmentation violation detected at Thu Mar 14 08:52:21 2019 -0700
Crash Decoding : Disabled - No sandbox or build area path
Crash Mode : continue (default)
Default Encoding : UTF-8
Deployed : false
GNU C Library : 2.24 stable
Graphics Driver : Unknown software
Java Version : Java 1.8.0_144-b01 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
MATLAB Architecture : glnxa64
MATLAB Entitlement ID : 1378095
MATLAB Root : /cvmfs/
MATLAB Version : (R2018a)
OpenGL : software
Operating System : "CentOS Linux release 7.6.1810 (Core) "
Process ID : 171494
Processor ID : x86 Family 6 Model 79 Stepping 1, GenuineIntel
Session Key : 32be7088-53bb-46ac-a878-a0e4028bfd50
Static TLS mitigation : Disabled: Unnecessary 1
Window System : No active display
Fault Count: 1
Abnormal termination
Register State (from fault):
RAX = 0000000000000000 RBX = 0000000000000000
RCX = 00002b105df7b080 RDX = 0000000000000000
RSP = 00002b1073ffcc00 RBP = 00002b1073ffcc10
RSI = 00002b1073ffcf10 RDI = 0000000000000000
R8 = 00002b1042d133e8 R9 = 0000000000000030
R10 = 000000000000042b R11 = 00002b1047aff750
R12 = 0000000000000000 R13 = 00002b1073ffcf10
R14 = 0000000000000000 R15 = 00002b105df7b080
RIP = 00002b1047aa710c EFL = 0000000000010206
CS = 0033 FS = 0000 GS = 0000
Stack Trace (from fault):
[ 0] 0x00002b1047aa710c bin/glnxa64/ _ZN6matrix6detail10noninlined12mx_array_api15mxGetDimensionsEPK11mxArray_tag+00000012
[ 1] 0x00002b10e29b6c00 /home/naveen/Matlab/Mexcuda/example.mexa64+00003072 mexFunction+00000106
[ 2] 0x00002b105dd49080 bin/glnxa64/
[ 3] 0x00002b105dd49447 bin/glnxa64/
[ 4] 0x00002b105dd49f2b bin/glnxa64/
[ 5] 0x00002b105dd3430c bin/glnxa64/
[ 6] 0x00002b105bdca2ad bin/glnxa64/ _ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2_+00000829
[ 7] 0x00002b105bdcabae bin/glnxa64/ _ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2_+00000030
[ 8] 0x00002b105ee70da1 bin/glnxa64/
[ 9] 0x00002b105ee71982 bin/glnxa64/
[ 10] 0x00002b105ef59fc9 bin/glnxa64/
[ 11] 0x00002b105eefb431 bin/glnxa64/
[ 12] 0x00002b105e7015a8 bin/glnxa64/
[ 13] 0x00002b105e703cbc bin/glnxa64/
[ 14] 0x00002b105e70001d bin/glnxa64/
[ 15] 0x00002b105e6f9ba1 bin/glnxa64/
[ 16] 0x00002b105e6f9dd9 bin/glnxa64/
[ 17] 0x00002b105e6ff846 bin/glnxa64/
[ 18] 0x00002b105e6ff92f bin/glnxa64/
[ 19] 0x00002b105e82e503 bin/glnxa64/
[ 20] 0x00002b105e831cf3 bin/glnxa64/
[ 21] 0x00002b105ed41f6d bin/glnxa64/
[ 22] 0x00002b105ecef60c bin/glnxa64/
[ 23] 0x00002b105ecf6448 bin/glnxa64/
[ 24] 0x00002b105ecf7e22 bin/glnxa64/
[ 25] 0x00002b105ed85807 bin/glnxa64/
[ 26] 0x00002b105ed85aea bin/glnxa64/
[ 27] 0x00002b105dab591a bin/glnxa64/ _Z8mnParserv+00000874
[ 28] 0x00002b105b7bebb8 bin/glnxa64/
[ 29] 0x00002b1048475e9f bin/glnxa64/ _ZNSt13__future_base13_State_baseV29_M_do_setEPSt8functionIFSt10unique_ptrINS_12_Result_baseENS3_8_DeleterEEvEEPb+00000031
[ 30] 0x00002b1046e464f9 /cvmfs/
[ 31] 0x00002b1048476126 bin/glnxa64/ _ZSt9call_onceIMNSt13__future_base13_State_baseV2EFvPSt8functionIFSt10unique_ptrINS0_12_Result_baseENS4_8_DeleterEEvEEPbEJPS1_S9_SA_EEvRSt9once_flagOT_DpOT0_+00000102
[ 32] 0x00002b105b7be9d3 bin/glnxa64/
[ 33] 0x00002b10435a61a2 bin/glnxa64/ _ZN14cmddistributor15PackagedTaskIIP10invokeFuncIN7mwboost8functionIFvvEEEEENS2_10shared_ptrINS2_13unique_futureIDTclfp_EEEEEERKT_+00000082
[ 34] 0x00002b10435a64e8 bin/glnxa64/ _ZNSt17_Function_handlerIFN7mwboost3anyEvEZN14cmddistributor15PackagedTaskIIP10createFuncINS0_8functionIFvvEEEEESt8functionIS2_ET_EUlvE_E9_M_invokeERKSt9_Any_data+00000024
[ 35] 0x00002b105b206e6c bin/glnxa64/ _ZN7mwboost6detail8function21function_obj_invoker0ISt8functionIFNS_3anyEvEES4_E6invokeERNS1_15function_bufferE+00000028
[ 36] 0x00002b105b20697f bin/glnxa64/ _ZN3iqm18PackagedTaskPlugin7executeEP15inWorkSpace_tagRN7mwboost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+00000447
[ 37] 0x00002b105b1e4ab1 bin/glnxa64/
[ 38] 0x00002b105b1c7ac8 bin/glnxa64/
[ 39] 0x00002b105b1c28bf bin/glnxa64/
[ 40] 0x00002b1044694a05 bin/glnxa64/
[ 41] 0x00002b1044695ff2 bin/glnxa64/
[ 42] 0x00002b10446968fb bin/glnxa64/ _Z25svWS_ProcessPendingEventsiib+00000187
[ 43] 0x00002b105b7bffc3 bin/glnxa64/
[ 44] 0x00002b105b7c06a4 bin/glnxa64/
[ 45] 0x00002b105b7b93f1 bin/glnxa64/
[ 46] 0x00002b1046e3f1f4 /cvmfs/
[ 47] 0x00002b104554c16f /cvmfs/ clone+00000095
[ 48] 0x0000000000000000 <unknown-module>+00000000
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
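A note on reading the trace above: frame 0 is mxGetDimensions, called directly from mexFunction, so the fault happens at the first API call, before any CUDA code runs. That would be consistent with the MEX function being invoked without a valid first input, although the trace alone does not prove it. Separately, MATLAB drops trailing singleton dimensions, so a 2-D input has only two entries in its dimension vector and reading Adims[2] goes out of bounds. Below is a minimal sketch of a guard. BatchDims and batch_dims are hypothetical names, and the plain size_t array is an assumed stand-in for what mxGetDimensions and mxGetNumberOfDimensions would return in a real MEX file (where nrhs should also be checked before touching prhs[0]).

```cpp
#include <cstddef>

// Result of interpreting a dimension vector as a batch of m-by-n matrices.
// Hypothetical type for illustration; not part of the MEX API.
struct BatchDims {
    std::size_t m;
    std::size_t n;
    std::size_t k;
    bool ok;  // false if the dimension vector cannot be used
};

// Derive (m, n, k) from a dimension vector and its length.
// Assumption: dims/ndims mirror mxGetDimensions/mxGetNumberOfDimensions.
BatchDims batch_dims(const std::size_t *dims, std::size_t ndims) {
    BatchDims out = {0, 0, 0, false};
    if (dims == nullptr || ndims < 2 || ndims > 3) {
        return out;  // reject missing input or more than three dimensions
    }
    out.m = dims[0];
    out.n = dims[1];
    // A 2-D input has no third entry; treat it as a batch of one plane.
    out.k = (ndims == 3) ? dims[2] : 1;
    out.ok = true;
    return out;
}
```

In a real gateway function, a failed check would typically end with a call to mexErrMsgTxt rather than continuing, so that MATLAB reports an error instead of crashing.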
Edric Ellis
on 15 Mar 2019
Not really an answer to your question as such - but note that if you have Parallel Computing Toolbox, you might be able to use pagefun - it doesn't support QR directly, but it does support batched mldivide...
Accepted Answer
James Tursa
on 14 Mar 2019
Edited: James Tursa
on 14 Mar 2019
Can you explain what you intended with these lines for A:
double **A;
A = (double**)mxGetPr(prhs[0]);
If you pass in a regular double array, there are doubles in the data area of prhs[0], not pointers to doubles. You've got one too many levels of indirection here. What were your intentions with this?
Downstream in your code you appear to use A[i] as a pointer in a memory copy. Since there are doubles behind A, and not pointers to doubles behind A, you would be using a floating point double bit pattern as a pointer and this will crash MATLAB.
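To make the point about indirection concrete, here is a minimal, MEX-free sketch. It assumes the data area of a full real double array is one contiguous block of m*n*k doubles in column-major order, and uses a std::vector as a stand-in for that block. The helper names element_index, make_stand_in, and read_element are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column-major linear index of element (row, col, page) in an
// m-by-n-by-k array. All indices are 0-based.
std::size_t element_index(std::size_t row, std::size_t col, std::size_t page,
                          std::size_t m, std::size_t n) {
    assert(row < m && col < n);
    return row + col * m + page * m * n;
}

// Stand-in for the array's data area: a flat block of m*n*k doubles,
// filled so that each value equals its own linear index.
std::vector<double> make_stand_in(std::size_t m, std::size_t n, std::size_t k) {
    std::vector<double> data(m * n * k);
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] = static_cast<double>(i);
    }
    return data;
}

// Read one element through a plain double* (a single level of indirection).
// There are no per-plane pointers stored anywhere in the block.
double read_element(const double *A, std::size_t row, std::size_t col,
                    std::size_t page, std::size_t m, std::size_t n) {
    return A[element_index(row, col, page, m, n)];
}
```

Note that the print loop in the question computes index = j * m + i with j running over rows and i over columns. Under the column-major assumption above, the in-plane index would instead be j + i * m, and j * m + i can run past the end of a plane when m is larger than n.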
James Tursa
on 15 Mar 2019
Edited: James Tursa
on 21 Mar 2019
With A declared as a plain double * (one level of indirection), A itself points to the first batch (we typically use the term "plane" or "page" here to refer to a 2D slice of a multi-dimensional array). To point to the next plane, simply increment the pointer by the appropriate amount. E.g.,
A points to the first plane
A+m*n points to the second plane
A+m*n*2 points to the third plane
A+m*n*3 points to the fourth plane
So, programmatically you would simply use A+m*n*i as your pointer to the plane you want to process, where i is a 0-based index (like you currently have in your for-loop).
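The plane arithmetic above can be sketched as follows. This is a host-only illustration with hypothetical helper names (plane_ptr, plane_pointers); it assumes A points at a flat block of m*n*k doubles and builds the array of per-plane pointers that a batched routine would expect.

```cpp
#include <cstddef>
#include <vector>

// Pointer to the start of plane i (0-based) in a flat m-by-n-by-k block.
double *plane_ptr(double *A, std::size_t m, std::size_t n, std::size_t i) {
    return A + m * n * i;
}

// Collect one pointer per plane. No data is copied; every entry points
// into the original block.
std::vector<double *> plane_pointers(double *A, std::size_t m, std::size_t n,
                                     std::size_t k) {
    std::vector<double *> planes(k);
    for (std::size_t i = 0; i < k; ++i) {
        planes[i] = plane_ptr(A, m, n, i);
    }
    return planes;
}
```

The same arithmetic should carry over to a device buffer: one cudaMalloc of m*n*k doubles and one cudaMemcpy could replace the per-plane allocations and copies, with the device plane pointers computed as d_base + m*n*i. That variant is a suggestion and has not been tested here. Also note that the code in the question never initializes cublas_handle; a handle needs to be created with cublasCreate before it is passed to any cuBLAS routine.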
