nvptx-none-run: Reduce CU_LIMIT_STACK_SIZE from 256 KiB to 128 KiB.
... to work around <#8>.

According to Table 12, Technical Specifications per Compute Capability, on
<http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications>,
there is 512 KiB of local memory per thread, so a 256 KiB stack seemed
workable. It indeed is on a system with Nvidia Quadro K1000M hardware, driver
version 340.24, and a CUDA 5.5 installation, but it is not on a system with
Nvidia Tesla K20c hardware, driver version 319.37, and a CUDA 5.5 installation.

On the Nvidia Quadro K1000M system, no changes in GCC testsuite results.
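
The message does not say why 256 KiB works on one device and not the other. One possible explanation, offered here only as an assumption, is that the stack limit has to be backed for every thread that can be resident on the device at once, so the total grows with the number of multiprocessors. The sketch below does that arithmetic. The helper names are hypothetical, and the device figures used in the note after it (13 multiprocessors and 5 GiB for the Tesla K20c, 1 multiprocessor and 2 GiB for the Quadro K1000M, 2048 resident threads per multiprocessor) come from public spec sheets, not from this commit.

```c
#include <assert.h>
#include <stdint.h>

#define KIB 1024ULL
#define MIB (1024ULL * KIB)

/* Hypothetical helper: bytes of local memory needed if every thread that
   can be resident on the device gets its own stack of STACK_BYTES.
   ASSUMPTION: the driver backs the stack limit for all resident threads;
   the commit does not confirm this.  */
uint64_t
stack_backing_bytes (unsigned sms, unsigned threads_per_sm,
                     uint64_t stack_bytes)
{
  assert (sms > 0 && threads_per_sm > 0);
  return (uint64_t) sms * threads_per_sm * stack_bytes;
}

/* Hypothetical helper: nonzero if that backing store fits within
   DEVICE_BYTES of device memory (ignoring everything else that
   competes for the same memory, such as the heap).  */
int
stack_fits (unsigned sms, unsigned threads_per_sm, uint64_t stack_bytes,
            uint64_t device_bytes)
{
  return stack_backing_bytes (sms, threads_per_sm, stack_bytes)
         <= device_bytes;
}
```

Under those assumed figures, `stack_backing_bytes (13, 2048, 256 * KIB)` comes to 6656 MiB, more than the K20c's 5120 MiB, while 128 KiB comes to 3328 MiB. On the single-multiprocessor K1000M, 256 KiB needs only 512 MiB. This is consistent with the behavior described above but is not a confirmed diagnosis of the issue.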
tschwinge committed Feb 13, 2015
1 parent 2104452 commit b4ca018
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions nvptx-run.c
@@ -224,14 +224,15 @@ This program has absolutely no warranty.\n",
}

#if 0
-/* Default seems to be 8k stack, 8M heap. */
+/* Default seems to be 1 KiB stack, 8 MiB heap. */
size_t stack, heap;
cuCtxGetLimit (&stack, CU_LIMIT_STACK_SIZE);
cuCtxGetLimit (&heap, CU_LIMIT_MALLOC_HEAP_SIZE);
printf ("stack %ld heap %ld\n", stack, heap);
#endif

-r = cuCtxSetLimit(CU_LIMIT_STACK_SIZE, 256 * 1024);
+/* <https://github.com/MentorEmbedded/nvptx-tools/issues/8>. */
+r = cuCtxSetLimit(CU_LIMIT_STACK_SIZE, 128 * 1024);
fatal_unless_success (r, "could not set stack limit");
r = cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, 256 * 1024 * 1024);
fatal_unless_success (r, "could not set heap limit");