Bug 2545 - performance on std::map
Summary: performance on std::map
Status: NEW
Alias: None
Product: uClibc++
Classification: Unclassified
Component: Standard Compliance (show other bugs)
Version: unspecified
Hardware: All All
: P5 major
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-08 23:57 UTC by Nuno Goncalves
Modified: 2015-04-16 08:13 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:


Attachments
testcase for std::map function find (356 bytes, text/plain)
2011-06-04 14:40 UTC, Oliver Metz
Details
Data (text) taken from Wikipedia (3.36 KB, application/octet-stream)
2011-06-04 14:40 UTC, Oliver Metz
Details
авк (deleted)
2015-04-16 01:56 UTC, Александр
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Nuno Goncalves 2010-09-08 23:57:35 UTC
I was updating the nmap package for ubuntu on a MIPS machine and I noticed the performance was unaceptable. I changed to libstdcpp and it improved in the order of 1000 times(1s vs 1000s).

I narrowed the problem to two calls on a std:map variable:

class std::map<port_spec, service_node> service_table.

line 241: i = service_table.find(ps); takes 50% of the time.
line 274: service_table[ps] = sn; takes 45% of the time.

Can I help in any way to find the problem/is this not a real issue?

Regards
Comment 1 Nuno Goncalves 2010-11-28 00:05:19 UTC
(In reply to comment #0)
> I was updating the nmap package for ubuntu on a MIPS machine and I noticed the

I mean openwrt, and not ubuntu.
Comment 2 Bernhard Reutner-Fischer 2010-11-28 00:26:33 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > I was updating the nmap package for ubuntu on a MIPS machine and I noticed the
> 
> I mean openwrt, and not ubuntu.

You could provide a small, self-contained testcase that illustrates the problem. How big is that map? How does it look like?
Comment 3 Nuno Goncalves 2010-11-28 11:44:42 UTC
Sorry, I didn't provide that much information in fact.

The elements of class std::map<port_spec, service_node> service_table are:

struct port_spec {
  int portno; /* Network byte order */
  std::string proto;

  /* Sort in the usual nmap-services order. */
  bool operator<(const port_spec& other) const {
    if (this->portno < other.portno)
      return true;
    else if (this->portno > other.portno)
      return false;
    else
      return this->proto < other.proto;
  }
};

/* This is a servent augmented by a frequency ratio. */
struct service_node : public servent {
public:
  double ratio;
};


It will take up to 65536 elements.

I checked on http://cxx.uclibc.org/faq.html:

A deque needs only 3 pointers and a size parameter (essentially 4 integers) to manage itself. Compare this with at least 2 pointers per item for a tree. After two items, the memory overhead of the map already outweighs that of the deque. Also, deque lookups may be faster as moving to the next level in the search does not require indirection. On the down side, a deque does not resize cleanly like a map does. It will use more memory than needed in buffering, and will occasionally need to copy all of the elements to a new container for more free space. Inserts also execute in O(n) time instead of O(log(n)) time on a deque unless they are to be added to the front or back of the deque, at which point they are inserted in O(1)+ time. Lookups and updates are O(log(n)) on both systems. In the end, I thought that the memory usage of the system outweighed the performance issues involved. 

Does this mean this is a intrinsic performance problem to save memory? I must always use the standard libstdc++?
Comment 4 Bernhard Reutner-Fischer 2010-11-28 12:25:26 UTC
(In reply to comment #3)
> Sorry, I didn't provide that much information in fact.
> 
> The elements of class std::map<port_spec, service_node> service_table are:
> 
> struct port_spec {
>   int portno; /* Network byte order */
>   std::string proto;
> 
>   /* Sort in the usual nmap-services order. */
>   bool operator<(const port_spec& other) const {
>     if (this->portno < other.portno)
>       return true;
>     else if (this->portno > other.portno)
>       return false;
>     else
>       return this->proto < other.proto;

This sounds rather erm elaborate, but ok.

>   }
> };
> 
> /* This is a servent augmented by a frequency ratio. */
> struct service_node : public servent {
> public:
>   double ratio;
> };
> 
> 
> It will take up to 65536 elements.
 
> Does this mean this is a intrinsic performance problem to save memory? I must
> always use the standard libstdc++?

Well, *currently*.
As said, please provide a small, self contained testcase with real world data and we'll see what we can do.
Comment 5 Oliver Metz 2011-06-04 14:39:26 UTC
(In reply to comment #4)
> As said, please provide a small, self contained testcase with real world data
> and we'll see what we can do.
Attached is a small testcase for the function find. I did not try if add/insert is as bad too.

The times were meassured on a 360 MHz mipsel embedded system. The binaries were compiled static without optimization flags.

root@fritz:/var/mod/root# time ./map_test.uClibcxx  < data
You typed 'spoon' 5 time(s)
real    0m 0.73s
user    0m 0.69s
sys     0m 0.05s
root@fritz:/var/mod/root# time ./map_test.libstdcxx < data
You typed 'spoon' 5 time(s)
real    0m 0.02s
user    0m 0.02s
sys     0m 0.01s
Comment 6 Oliver Metz 2011-06-04 14:40:14 UTC
Created attachment 3349 [details]
testcase for std::map function find
Comment 7 Oliver Metz 2011-06-04 14:40:40 UTC
Created attachment 3355 [details]
Data (text) taken from Wikipedia
Comment 8 Oliver Metz 2011-06-04 16:02:52 UTC
With a 180 KB text file the results are more clear:
root@fritz:/var/mod/root# time ./map_test.libstdcxx < data
You typed 'spoon' 5 time(s)

real    0m0.924s
user    0m0.880s
sys     0m0.040s
root@fritz:/var/mod/root# time ./map_test.uClibcxx < data
You typed 'spoon' 5 time(s)

real    14m19.181s
user    9m28.790s
sys     4m43.530s
Comment 9 mw3demo 2013-12-24 11:09:12 UTC
Any update on this? Oliver Metz provided the test case.
Comment 10 mw3demo 2013-12-24 11:11:47 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Sorry, I didn't provide that much information in fact.
> > 
> > The elements of class std::map<port_spec, service_node> service_table are:
> > 
> > struct port_spec {
> >   int portno; /* Network byte order */
> >   std::string proto;
> > 
> >   /* Sort in the usual nmap-services order. */
> >   bool operator<(const port_spec& other) const {
> >     if (this->portno < other.portno)
> >       return true;
> >     else if (this->portno > other.portno)
> >       return false;
> >     else
> >       return this->proto < other.proto;
> 
> This sounds rather erm elaborate, but ok.
> 
> >   }
> > };
> > 
> > /* This is a servent augmented by a frequency ratio. */
> > struct service_node : public servent {
> > public:
> >   double ratio;
> > };
> > 
> > 
> > It will take up to 65536 elements.
> 
> > Does this mean this is a intrinsic performance problem to save memory? I must
> > always use the standard libstdc++?
> 
> Well, *currently*.
> As said, please provide a small, self contained testcase with real world data
> and we'll see what we can do.

Bernhard Reutner-Fischer: Ping
Comment 11 mw3demo 2013-12-24 11:17:22 UTC
Wow, seems I might have pulled this one back up from the grave. I didn't notice the last post was over 2 years ago.
Comment 12 Александр 2015-04-16 01:56:42 UTC
Created attachment 6001 [details]
авк

Жесть
Comment 13 Bernhard Reutner-Fischer 2015-04-16 08:13:46 UTC
The content of attachment 6001 [details] has been deleted by
    Bernhard Reutner-Fischer <aldot@uclibc.org>
who provided the following reason:

pdf

The token used to delete this attachment was generated at 2015-04-16 08:13:27 UTC.