Viewer.add_points significantly slower after upgrade to v-0.4

Hi everyone,

I recently upgraded Napari to v-0.4 from v-0.3.4. I noticed that adding points to a viewer has become significantly slower (even on small images). For larger images, it does not even load. The same task used to take a few seconds previously (before the upgrade). Is this an expected bug? Wondering if I should revert back to 0.3.4.

Thank you,
Ajit.

Hi @ajitjohnson, sorry to hear that! I’d love it if you could provide a minimal example that I can test with, meanwhile I’ll do some investigations on my end (i.e. is it just the initial call to add_points, is it adding a point when many already exist, all of the above).

We definitely want to avoid performance regressions like that and have a couple benchmarks that we measure, but may have missed something here.

We’re preparing the 0.4.1 release right now, and so might even be able to get a fix in if we knew exactly where to look. I’ll report back as I find out more.

Here are some raw benchmark results running the following command at different points

asv run -E existing -b benchmark_points_layer.Points2DSuite`

which runs this points 2D benchmark suite, which just has some basic tests in it. Note I’ve done this for the following napari versions all in the same environment on my mac, so this isn’t a full rigorous test, just a first pass to try and catch anything obvious. There’s also a lot about GUI interactions that isn’t tested here too.

The full results are below, but it does look like there is a slight regression going from 0.3.4 to 0.4.0 (and not much change to the 0.4.1 release candidate) in the time for an add_points call with 65536 points. It has gone from 28.5ms to 35.5ms to 34.7ms respectively, so a slowdown of about 20 to 25%, which is not negligible, and we may want to look into, but doesn’t sound like a transition from a few seconds to not usable. I’ve also included the 0.3.8 performance, which is 31.6ms for that benchmark, so fairly in the middle, suggesting that at least in this case the regression might be the accumulation of multiple small things.

If you can provide more details of your usecase, number of points, 2D/ 3D, custom colors, sizes, etc then we can hopefully chase this down better.

Thanks!!


Full benchmark results

0.4.1 pre release

[ 71.43%] ··· benchmark_points_layer.Points2DSuite.time_create_layer                                                                                                                                      ok
[ 71.43%] ··· ======== =============
               param1
              -------- -------------
                 16     5.39±0.09ms
                 64     5.38±0.08ms
                256      5.58±0.1ms
                1024    5.99±0.09ms
                4096     7.62±0.1ms
               16384     12.6±0.3ms
               65536     34.7±0.8ms
              ======== =============

[ 78.57%] ··· benchmark_points_layer.Points2DSuite.time_get_value                                                                                                                                         ok
[ 78.57%] ··· ======== =============
               param1
              -------- -------------
                 16       695±10μs
                 64       712±20μs
                256       711±9μs
                1024      758±20μs
                4096      995±20μs
               16384    1.93±0.05ms
               65536     5.73±0.1ms
              ======== =============

[ 85.71%] ··· benchmark_points_layer.Points2DSuite.time_refresh                                                                                                                                           ok
[ 85.71%] ··· ======== =============
               param1
              -------- -------------
                 16     1.68±0.04ms
                 64     1.76±0.07ms
                256     1.75±0.01ms
                1024    1.87±0.03ms
                4096    2.50±0.08ms
               16384     4.31±0.1ms
               65536     12.7±0.2ms
              ======== =============

[ 92.86%] ··· benchmark_points_layer.Points2DSuite.time_set_view_slice                                                                                                                                    ok
[ 92.86%] ··· ======== ==========
               param1
              -------- ----------
                 16     202±5μs
                 64     207±1μs
                256     208±9μs
                1024    218±2μs
                4096    247±10μs
               16384    347±5μs
               65536    730±10μs
              ======== ==========

[100.00%] ··· benchmark_points_layer.Points2DSuite.time_update_thumbnail                                                                                                                                  ok
[100.00%] ··· ======== =============
               param1
              -------- -------------
                 16       321±5μs
                 64       333±4μs
                256       360±8μs
                1024      433±10μs
                4096      766±30μs
               16384    1.68±0.02ms
               65536     5.48±0.2ms
              ======== =============

0.4.0 release

[ 71.43%] ··· benchmark_points_layer.Points2DSuite.time_create_layer                                                                                                                                      ok
[ 71.43%] ··· ======== ============
               param1
              -------- ------------
                 16     4.99±0.2ms
                 64     5.05±0.1ms
                256     5.11±0.2ms
                1024    5.42±0.2ms
                4096    7.02±0.2ms
               16384    12.5±0.2ms
               65536    35.5±0.7ms
              ======== ============

[ 78.57%] ··· benchmark_points_layer.Points2DSuite.time_get_value                                                                                                                                         ok
[ 78.57%] ··· ======== =============
               param1
              -------- -------------
                 16       557±40μs
                 64      659±200μs
                256       531±4μs
                1024      588±10μs
                4096      840±20μs
               16384    1.78±0.02ms
               65536    5.69±0.08ms
              ======== =============

[ 85.71%] ··· benchmark_points_layer.Points2DSuite.time_refresh                                                                                                                                           ok
[ 85.71%] ··· ======== =============
               param1
              -------- -------------
                 16     1.54±0.03ms
                 64     1.51±0.01ms
                256     1.60±0.02ms
                1024    1.73±0.03ms
                4096    2.36±0.08ms
               16384     4.46±0.1ms
               65536     13.0±0.5ms
              ======== =============

[ 92.86%] ··· benchmark_points_layer.Points2DSuite.time_set_view_slice                                                                                                                                    ok
[ 92.86%] ··· ======== ===========
               param1
              -------- -----------
                 16      301±10μs
                 64      299±9μs
                256      301±7μs
                1024     310±10μs
                4096     347±7μs
               16384    460±100μs
               65536     872±20μs
              ======== ===========

[100.00%] ··· benchmark_points_layer.Points2DSuite.time_update_thumbnail                                                                                                                                  ok
[100.00%] ··· ======== =============
               param1
              -------- -------------
                 16       337±9μs
                 64       349±7μs
                256       365±6μs
                1024      459±10μs
                4096      782±20μs
               16384    1.79±0.07ms
               65536     5.79±0.2ms
              ======== =============

0.3.4 release

[ 71.43%] ··· benchmark_points_layer.Points2DSuite.time_create_layer                                                                                                                                      ok
[ 71.43%] ··· ======== ============
               param1
              -------- ------------
                 16     2.51±0.4ms
                 64     2.72±0.4ms
                256     2.59±0.1ms
                1024    2.88±0.2ms
                4096    4.16±0.1ms
               16384     9.81±1ms
               65536     28.5±1ms
              ======== ============

[ 78.57%] ··· benchmark_points_layer.Points2DSuite.time_get_value                                                                                                                                         ok
[ 78.57%] ··· ======== =============
               param1
              -------- -------------
                 16       77.7±8μs
                 64       83.9±7μs
                256       95.1±4μs
                1024      168±9μs
                4096      390±10μs
               16384    1.36±0.05ms
               65536     5.17±0.1ms
              ======== =============

[ 85.71%] ··· benchmark_points_layer.Points2DSuite.time_refresh                                                                                                                                           ok
[ 85.71%] ··· ======== =============
               param1
              -------- -------------
                 16       718±60μs
                 64       737±30μs
                256      824±100μs
                1024     926±100μs
                4096    1.43±0.02ms
               16384    3.04±0.07ms
               65536     9.60±0.2ms
              ======== =============

[ 92.86%] ··· benchmark_points_layer.Points2DSuite.time_set_view_slice                                                                                                                                    ok
[ 92.86%] ··· ======== ============
               param1
              -------- ------------
                 16     52.1±0.9μs
                 64      53.1±3μs
                256      48.7±2μs
                1024     52.7±4μs
                4096     53.7±2μs
               16384     72.1±4μs
               65536     141±7μs
              ======== ============

[100.00%] ··· benchmark_points_layer.Points2DSuite.time_update_thumbnail                                                                                                                                  ok
[100.00%] ··· ======== =============
               param1
              -------- -------------
                 16       295±10μs
                 64       292±9μs
                256       310±10μs
                1024      371±10μs
                4096      605±30μs
               16384    1.13±0.04ms
               65536    3.55±0.06ms
              ======== =============

0.3.8 release

[ 71.43%] ··· benchmark_points_layer.Points2DSuite.time_create_layer                                                                                                                                      ok
[ 71.43%] ··· ======== =============
               param1
              -------- -------------
                 16     3.03±0.07ms
                 64     2.96±0.06ms
                256     3.01±0.08ms
                1024     3.65±0.2ms
                4096     5.09±0.1ms
               16384     10.3±0.2ms
               65536      31.6±1ms
              ======== =============

[ 78.57%] ··· benchmark_points_layer.Points2DSuite.time_get_value                                                                                                                                         ok
[ 78.57%] ··· ======== =============
               param1
              -------- -------------
                 16       73.2±3μs
                 64       78.6±3μs
                256       99.2±4μs
                1024      148±5μs
                4096      417±20μs
               16384    1.28±0.03ms
               65536     4.85±0.2ms
              ======== =============

[ 85.71%] ··· benchmark_points_layer.Points2DSuite.time_refresh                                                                                                                                           ok
[ 85.71%] ··· ======== =============
               param1
              -------- -------------
                 16       886±50μs
                 64       896±20μs
                256     1.01±0.03ms
                1024    1.20±0.09ms
                4096    1.81±0.04ms
               16384     3.76±0.1ms
               65536     11.8±0.5ms
              ======== =============

[ 92.86%] ··· benchmark_points_layer.Points2DSuite.time_set_view_slice                                                                                                                                    ok
[ 92.86%] ··· ======== ==========
               param1
              -------- ----------
                 16     231±4μs
                 64     231±5μs
                256     254±20μs
                1024    258±20μs
                4096    292±4μs
               16384    392±10μs
               65536    799±20μs
              ======== ==========

[100.00%] ··· benchmark_points_layer.Points2DSuite.time_update_thumbnail                                                                                                                                  ok
[100.00%] ··· ======== =============
               param1
              -------- -------------
                 16       327±8μs
                 64       321±2μs
                256       354±6μs
                1024      437±10μs
                4096      793±20μs
               16384    1.70±0.04ms
               65536     5.47±0.1ms
              ======== =============

Thanks a lot for your quick response. I am trying to troubleshoot this now.

Here is the run time on a small image (doing the exact same thing, before and after the upgrade).

Old script (runs in 5 seconds): https://github.com/ajitjohnson/scimap/blob/127d471f840e6b463cb9d758220a6aa96f570fef/scimap/plotting/_gate_finder.py

New script (runs in 25 seconds):

As you would see that the additional things that I added to the new script are below and also updated Napari to the latest version.

import dask.array as da
import zarr

However, the loading of the image does not determine the time that I had mentioned above, It is purely due to add_points step.

I wonder if adding points to a zarr based viewer takes more time than loading the entire image into memory. I am figuring that out now and will report the updates.

ON A SEPERATE NOTE:

I created a new Conda environment and installed the latest version of Napari. In windows, I get the following error (tested in both jupyter notebook and spyder IDE). Note the same installation works on mac. Any idea why? I saw that there was a similar issue a while time back but was resolved? https://github.com/napari/napari/issues/1254

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-5-1e1049a33073> in <module>
----> 1 import napari

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\__init__.py in <module>
     58 # decorated with keybindings during that process, but it is not directly needed
     59 # by our users and so is deleted below
---> 60 from . import _viewer_key_bindings  # noqa: F401
     61 from ._qt import gui_qt
     62 from .plugins.io import save_layers

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\_viewer_key_bindings.py in <module>
      1 import numpy as np
      2 
----> 3 from .viewer import Viewer
      4 
      5 

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\viewer.py in <module>
      7 
      8 from . import __version__
----> 9 from ._qt.qt_main_window import Window
     10 from ._qt.qt_viewer import QtViewer
     11 from ._qt.qthreading import create_worker, wait_for_workers_to_quit

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\_qt\__init__.py in <module>
      3 from .widgets.qt_range_slider import QHRangeSlider, QVRangeSlider
      4 
----> 5 import_resources()

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\resources\__init__.py in import_resources(version, overwrite)
    117     # regenerated.
    118     overwrite = overwrite or bool(environ.get('NAPARI_REBUILD_RESOURCES'))
--> 119     respath = build_pyqt_resources(fspath(target_file), overwrite=overwrite)
    120 
    121     # import the python resources file and add to sys.modules

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\resources\build_icons.py in build_pyqt_resources(out_path, overwrite)
    216     #     3.7.4/envs/napari-pyqt5
    217     #     napari-pyqt5
--> 218     for name in _find_rcc_or_raise():
    219         try:
    220             check_call([name, '-o', out_path, qrc_path])

~\AppData\Local\Continuum\anaconda3\envs\scimap_new\lib\site-packages\napari\resources\build_icons.py in _find_rcc_or_raise()
    168             yield rcc_binary
    169     raise FileNotFoundError(
--> 170         "Unable to find an executable to build Qt resources (icons).\n"
    171         "Tried: 'pyrcc5.bat', 'pyrcc5', 'pyside2-rcc'.\n"
    172         "Please open issue at https://github.com/napari/napari/issues/."

FileNotFoundError: Unable to find an executable to build Qt resources (icons).
Tried: 'pyrcc5.bat', 'pyrcc5', 'pyside2-rcc'.
Please open issue at https://github.com/napari/napari/issues/.

Thank you for taking a look at this.

UPDATE:

Here is what I did.

  1. Ran the old and new script in the old environment:
Napari version: 0.3.8
Dask version: 2.30.0
Zarr version: 2.5.0
pyqt5: 5.12.3

Result:

  • old script: 5 seconds
  • new script: 5 seconds

Suggests that the other changes that I had made in the script do not affect the run time. Like Zarr based image loading implementation.

  1. Ran the old and new script in the New environment:
Napari version: 0.4.0
Dask version: 2.30.0
Zarr version: 2.5.0
pyqt5: 5.12.3

Result:

  • old script: 23 seconds
  • new script: 23 seconds

Do you think the latest version of Napari behaves differently while adding points to a zarr based viewer? I am also going to check if there is any difference between versions of other dependencies.

UPDATE:

These are the only things different between the old and new envs.

45c45
< importlib-metadata=2.0.0=pypi_0
---
> importlib-metadata=3.0.0=pypi_0

78c78
< napari=0.4.0=pypi_0
---
> napari=0.3.8=pypi_0

262c262
< scimap=0.9.12=pypi_0
---
> scimap=0.8.9=pypi_0

Looks like napari and importlib-metadata are the only difference that is causing the difference in speed.

UPDATE:

So the delay in time mainly originates from adding multiple add.points layers sequentially.

New method- time to add each layer: (Napari 0.4)

0.06848001480102539
0.07346773147583008
0.07529902458190918
0.09064173698425293
0.12967920303344727
0.14171504974365234
0.1455059051513672
0.26712775230407715
0.19836711883544922
0.23760318756103516
0.26413917541503906
0.2744309902191162
0.31336212158203125
0.34906005859375
0.3858470916748047
0.44211292266845703
0.4571719169616699
0.502331018447876
0.5380890369415283
0.5947880744934082
0.6245160102844238
0.6787590980529785
0.7344608306884766
0.7649087905883789
0.8239521980285645
0.8935399055480957
0.9337267875671387
1.0980708599090576
1.041450023651123
1.1222071647644043

old method: (Napari 0.3.8)

0.036271095275878906
0.03895974159240723
0.03988003730773926
0.0468599796295166
0.04832792282104492
0.05045890808105469
0.0561368465423584
0.05491805076599121
0.05881094932556152
0.27455806732177734
0.07102823257446289
0.07316398620605469
0.0800008773803711
0.08454084396362305
0.08282995223999023
0.09413671493530273
0.10779190063476562
0.10585498809814453
0.1167292594909668
0.12280106544494629
0.12477898597717285
0.1352708339691162
0.14667606353759766
0.15279603004455566
0.15231108665466309
0.26280713081359863
0.1712019443511963
0.18393826484680176
0.19850921630859375
0.20503687858581543

Unfortunately, I don’t think I can do much more to troubleshoot.

@sofroniewn any idea how I could improve performance? Looks like the time to add new layers with add.points seem to significantly increase with the number of existing layers in the viewer (in Napari 0.4.0).

With larger images, the time scales and moves to minutes for adding each layer and ultimately looks like it just froze.

Hi @ajitjohnson, just catching up now. Thanks for the additional detail. I’m able to replicate the main observed slow down with the following minimal code

import numpy as np
import napari
from time import time

data = []
for i in range(50):
    data.append(np.random.random((1000, 2)))

with napari.gui_qt():
    viewer = napari.Viewer()
    for i, points in enumerate(data):
        # add the points
        start = time()
        viewer.add_points(points)
        end = time()
        print(f'run {i} in time {end - start}')

This becomes longer and longer for 0.4.0 becoming unusable, but stays roughly flat for 0.3.4.

I’m reminded of this recent comment from @simocode on using many points layers Visualisation of images and million of dots in napari. You can see some comments from @jni recommending alternative approaches, but I don’t think we had realized that performance was scaling so badly with the number of layers due to a regression.

I’ll investigate further with some profiling and performance monitoring tools and report back.

1 Like

Thank you @sofroniewn

Btw, also any idea why I get this error in windows?

FileNotFoundError: Unable to find an executable to build Qt resources (icons).
Tried: 'pyrcc5.bat', 'pyrcc5', 'pyside2-rcc'.
Please open issue at https://github.com/napari/napari/issues/.

Thank you.

Can you open up a new issue for that on the GitHub repo and only include the relevant info there. Even if something fixed we should add instructions/ comments to our documentation so people aren’t confused.

With regards to add_points I’m able to replecate the slow down only using our ViewerModel and not the full Viewer which means that is purely on the python side (i.e. nothing to do with rendering / vispy). I’ll post more as I figure it out

1 Like

Ok, so I’ve identified the cause of the performance regression as https://github.com/napari/napari/pull/1360, before this PR my minimal example is fast, after my minimal example is slow.

I still need to do some more profiling, but I had some fears that this might be the case. Basically we’re recalculating the layer extent for every single layer every time we add a new single layer because we’re using properties. Instead we should be caching the extent values when they change and updating only when we need. I should be able to make a fix here, and I’m really glad this has been brought to our attention in a real world case!!

Ok, dedicated bug issue made here now https://github.com/napari/napari/issues/1923 and we’ll probably want to continue the conversation there as we start thinking about possible solutions. Thanks for bringing this to our attention!!

Thank you so much @sofroniewn. Thank you for your amazing tool and thank you for your super quick response.

Best,
Ajit Johnson.

1 Like