[05:08:52] <circlicious> in netstat i see 2 mongod instances running on 27017 and 28017
[05:09:13] <circlicious> could it be related to not being able to resolve which one to connect to or something, just wondering. although i do specify the port in the connection string
[05:09:38] <circlicious> although ps ax | grep mongo shows only 1 mongod process, not sure :/
[05:09:58] <crudson> and check that --bind_ip isn't set to something different on the mongod command line
[05:10:43] <crudson> for instance, mongod --bind_ip 127.0.0.1 will not allow you to connect from server1
[05:11:31] <circlicious> crudson: in the config file there is: bind_ip = 127.0.0.1
[05:12:38] <crudson> circlicious: ok, but note that someone put it there for a reason. Check with them to see what your best choice is, as it's probably in place to prevent exactly what you're trying to do for security purposes.
[05:12:52] <crudson> I have to go now, but good luck. I'll leave this logged in.
[05:14:32] <circlicious> hm, bind_ip should just specify the ip that mongod listens on, why would it affect connections from another server? confused. not sure what to do :?
[05:32:40] <circlicious> ok i understand that now. hm, i think i need to first learn to work with firewalls properly. fine.
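(For context, a minimal sketch of the bind_ip setting being discussed; the file layout is an assumption, not circlicious's actual config:)

```
# hypothetical /etc/mongodb.conf excerpt
# binding only to loopback blocks connections from other hosts:
bind_ip = 127.0.0.1
port = 27017
# to accept connections from server1, bind to all interfaces (or a specific
# LAN address) and open port 27017 in the firewall:
# bind_ip = 0.0.0.0
```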
[07:31:32] <gigo1980> hi, i have a mapreduce that emits 200000 times. in the reduce function i only do a simple count, so why does it make 200 reduce calls that each count only 1001?
[07:55:12] <kali> gigo1980: that's the way the map reduce is implemented, the spec says that reducers can be stacked on top of each other. this is why their input and output have the same form
[07:55:46] <kali> gigo1980: calling a JS function with an array of 200000 values is impractical
[07:56:36] <kali> gigo1980: and stackable reduces allow a first stage of reduction to be performed on the shard node, before the final reduction is done on a single node
[07:58:08] <gigo1980> kali: so there is no way to get around this limitation?
[07:58:33] <kali> gigo1980: it's not a limitation, it's a feature
[07:59:23] <gigo1980> do you have an example of where something like this is done?
[08:01:59] <gigo1980> or a link where this way of using map reduce is described?
[08:28:32] <gigo1980> ok … so it only aggregates data this way. no problem, i can do this ...
[08:28:42] <NodeX> put it into redis or memcached as chunks and trickle it out
[08:28:52] <kali> gigo1980: so the mapper's emitted values must also strictly match the reduced values: in this case, emit("a", { sum: 5, max: 3, min: 1 }) somewhere, emit("a", { sum: 12, max: 4, min: 1 }) somewhere else
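(A minimal sketch of the re-reduce-safe reducer kali is describing, using the sum/max/min field names from his emit example; everything else is assumed:)

```javascript
// mongo shell sketch: a reducer whose output has the same shape as each
// emitted value, so MongoDB can reduce partial results again (per shard,
// or per batch of ~1000 values) and still get the right answer.
var reduceFn = function (key, values) {
    var out = { sum: 0, max: -Infinity, min: Infinity };
    values.forEach(function (v) {
        out.sum += v.sum;                    // partial sums add up
        out.max = Math.max(out.max, v.max);  // max of maxes is still the max
        out.min = Math.min(out.min, v.min);  // min of mins is still the min
    });
    return out;
};
```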
[08:28:58] <gigo1980> i'll try this and give you a response
[08:56:10] <NodeX> make sure any upserts that are index bound are indexed properly
[08:56:51] <NodeX> I had a similar problem a year ago with a history collection and couldn't work out why my inserts crawled to a halt, until I remembered I needed an index on a user id as the query was index bound
[08:59:13] <zeus> for 65 gb data my index is 14gb - is this possible ?
[09:47:38] <zeus> i've managed to reduce the index to 4.1 gb
[09:47:58] <zeus> now i'm moving to 8gb machines - i hope it helps
[09:49:18] <NodeX> I would think it's the page faults and the swapping
[09:49:51] <zeus> but i was hoping for a little more than 50 inserts/sec from a 2-cpu 4gb-ram machine, especially with an ssd drive
[09:50:46] <_johnny> zeus: how did you reduce from 14 to 4?
[09:51:57] <zeus> i had 3 indexes so i dropped the biggest one
[09:52:10] <zeus> removing the _id index would be sufficient
[09:52:12] <NodeX> you should drop every one you don't need
[09:52:16] <_johnny> ah, okay, i thought you had one (two with _id)
[09:52:44] <zeus> i can't - i don't use the _id index but dropping it is prohibited
[09:53:06] <NodeX> you never look anything up by _id?
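(A few shell commands relevant to inspecting and trimming indexes; the collection name "foo" is hypothetical. The _id index is created automatically and cannot be dropped:)

```javascript
db.foo.getIndexes();                 // list every index on the collection
db.foo.stats().totalIndexSize;       // combined index size in bytes
db.foo.dropIndex({ someField: 1 });  // drop one index by its key pattern
// db.foo.dropIndex({ _id: 1 }) is rejected: the _id index is mandatory
```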
[09:54:38] <_johnny> NodeX: after i rebuilt a (large) collection recently, i dropped a lot of indexes that i thought i needed (based on a sql layout, where you usually put indexes on whatever you use in WHERE clauses), and my lookups seem equally fast as before. does mongo look at the indexed field first, then the other conditions? that would be awesome - and it seems to be what it does
[09:55:32] <NodeX> it (tries to) optimise your query, or else you can hint()
[09:55:56] <NodeX> you need an index - for example, if you query for foo:123, bar:123 you need a compound index on both fields
[09:56:19] <NodeX> you also need to tell the index to be ascending or descending for sorting
[09:56:44] <_johnny> hmm, see that's interesting, because that's exactly what i do in one query, two fields, but only index on one, and it seems (just as) fast
[09:56:48] <NodeX> it will only use indexed fields if an index exists, if that makes sense
[09:57:21] <_johnny> right, but i was thinking in terms of order: it finds the indexed field first, and only needs to check the remaining conditions on those documents
[09:57:25] <NodeX> if there aren't a lot of documents then it will be fast
[09:57:51] <NodeX> use explain() on the end of a query and it will tell you what it's doing
[09:59:57] <NodeX> you can use compound keys efficiently and plan your queries to use part of the key
[10:00:28] <_johnny> ah, very cool. thanks a lot :)
[10:01:02] <_johnny> it seems to actually do what i hoped. across 2.5 million docs, it only looks through 7, and 7 is the number of docs that contain the indexed field value
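(A sketch of what NodeX describes, with hypothetical collection and field names: a compound index covering both query fields, plus explain() and hint():)

```javascript
// compound index on both queried fields; direction (1/-1) matters for sorts
db.foo.ensureIndex({ foo: 1, bar: 1 });

// explain() shows which index was chosen ("cursor") and how many documents
// were scanned ("nscanned") versus returned ("n")
db.foo.find({ foo: 123, bar: 123 }).explain();

// force a particular index if the optimiser picks the wrong one
db.foo.find({ foo: 123, bar: 123 }).hint({ foo: 1, bar: 1 });
```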
[10:05:21] <gigo1980> kali: i modified it … http://pastebin.com/vNmYyj8m
[10:05:22] <_johnny> yup. i'll do that over the course of an internal "test" i'll hold soon
[10:05:50] <gigo1980> but i have the same problem: a maximum of 1000 emits for each reduce..
[10:47:14] <kali> gigo1980: durationcount is not right yet, you need to add up the .durationcount values from the values array
[10:48:59] <kali> gigo1980: and yes, it will call the reduce for a limited number of emit()s, but this will no longer be a problem, because you've made it stackable
[10:54:41] <gigo1980> kali: can you please give me a snippet of how my reduce method should look?
[10:56:55] <kali> gigo1980: just replace reduced.durationcount++; with reduced.durationcount += values[i].durationcount
[10:58:04] <kali> gigo1980: the input value of a reducer is either the output of map (in that case durationcount is 1) or the result of a previous reduce (in that case durationcount will be > 1)
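(A sketch of the corrected reduce kali describes; gigo1980's pastebin isn't reproduced here, so the surrounding structure is assumed:)

```javascript
// re-reduce-safe counting: a value straight from map() carries
// durationcount: 1, while the output of an earlier partial reduce carries
// durationcount > 1, so the counts must be added, not incremented.
var reduceFn = function (key, values) {
    var reduced = { durationcount: 0 };
    for (var i = 0; i < values.length; i++) {
        reduced.durationcount += values[i].durationcount;
    }
    return reduced;
};
```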
[12:03:58] <siert> hey guys; any idea why rpm's on downloads-distro.mongodb.org are not signed?
[13:26:58] <woozly> but no operations... I have deleted a big collection (6000000 entries)
[13:56:51] <jmar777> woozly: anything in the logs?
[14:18:35] <augustl> how do I add a "unique" index for a specific key in an array? My documents have authTokens: [{key: "123abc", ...}, {key: "456def", ...}, ...]
[14:19:06] <augustl> uh, in my app it's actually called "token", not "key", in case that looked confusing :)
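(One way to express that in the shell; the collection name "users" is an assumption. Indexing the embedded field creates a multikey index, and the unique option keeps two different documents from sharing a token value:)

```javascript
db.users.ensureIndex({ "authTokens.token": 1 }, { unique: true });

// look up a document by one of its tokens; this query can use the index
db.users.find({ "authTokens.token": "123abc" });
```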
[14:23:56] <doxavore> Can you not change to hostnames from IPs in a replica set config? https://gist.github.com/af32c5e10b95cbdf16e5
[14:31:07] <seba89> hello, i'm having a problem connecting to mongo from php. it throws a MongoConnection exception with the message "transport endpoint is not connected".
[14:31:17] <seba89> i can connect normally through the mongo console
[14:31:48] <NodeX> pastebin your connection string
[14:32:05] <remonvv> Hi guys, we keep having issues with movePrimary on a sharded database resulting in the data disappearing until mongos is restarted.
[14:32:11] <remonvv> Anyone any idea what's happening?
[14:34:11] <seba89> NodeX: I'm using the default: "mongodb://localhost:27017"
[14:44:37] <dcbiker1029> how do I modify my upstart file in init.d to use a different dbpath? it is ignoring my /etc/mongodb modifications
[14:51:51] <xico> Bonjour my international partners in development.
[16:08:42] <Cubud_> Does anyone here use MongoDB with C#?
[16:10:41] <Cubud_> I need to know how in C# I would update an individual element in all documents which meet a criteria
[16:29:25] <NodeX> on the shell it's like this Cubud: db.foo.update({foo:'bar'},{$set : {field:'value'}},false,true);
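(For reference, the two trailing booleans in that shell call are upsert and multi; a commented version of the same statement:)

```javascript
// db.collection.update(criteria, update, upsert, multi)
db.foo.update(
    { foo: 'bar' },                // criteria: which documents to match
    { $set: { field: 'value' } },  // update: change one field, keep the rest
    false,                         // upsert: do not insert if nothing matches
    true                           // multi: update every matching document
);
```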
[16:30:32] <marek_> hi; I want to reference two user IDs in a Message object. Should I have a field [id1,id2] or two id fields which I query with an $or?
[16:31:09] <marek_> the Message isn't directional so the user IDs are interchangeable
[16:47:03] <int> In mongodb I have a field saved as u'timestamp': datetime.datetime(2012, 8, 23, 19, 55, 13, 830000)
[16:47:33] <int> using pymongo I am doing a query: {'timestamp': {'$lt': u'2012-08-24 16:43:54.017102'}}
[17:02:29] <int> _johnny: I think it's because I am using all this in an api and the filter is being passed in as a query string
[17:03:18] <_johnny> you can make a datetime object from the string. but check with something similar to what i wrote first - to see if that's the problem
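(The underlying issue is a type mismatch: the stored value is a BSON date, while the query compares it against a string, and range operators don't match across types. A shell sketch of the date-typed comparison, with a hypothetical collection name; in pymongo the equivalent is passing a datetime.datetime object instead of the string:)

```javascript
// compare the BSON date field against a real Date, not a string
db.events.find({ timestamp: { $lt: new Date("2012-08-24T16:43:54Z") } });
```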
[17:18:05] <Cubud_> NodeX: Yes, I can do that in the console but how would I structure it as a runCommand?
[17:18:16] <Cubud_> That's how the C# API seems to work
[17:36:24] <gigo1980> hi, what was the command to move a database to another shard in my cluster?
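(Presumably the command gigo1980 means is movePrimary, which moves a database's unsharded data to another shard; the database and shard names below are hypothetical, and it is run through mongos against the admin database:)

```javascript
db.adminCommand({ movePrimary: "mydatabase", to: "shard0001" });
```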
[18:04:46] <edmundsalvacion> Hi there, I have been noticing that mongo has been skipping records and was wondering if anyone had a bit more insight as to why this may be happening
[18:11:35] <tsally> can someone help me understand the relationship (or lack thereof) between indexes and routed queries? does an index just improve performance for certain queries once the query hits the mongod process?
[18:27:37] <edmundsalvacion> kali, what i have noticed is while iterating over, say one million, records in a collection which generally has frequent additions to it, when it reaches the end, it appears as if some records were not returned at all
[18:28:30] <edmundsalvacion> even on a smaller scale, i've noticed that while iterating over 8000 records, only 7000 or so were successfully returned
[18:29:43] <tsally> edmundsalvacion: are you reading as inserting is going on and are you reading from primary or secondary?
[18:30:16] <edmundsalvacion> tsally: yes i am reading as inserting is going on, and from the primary
[18:32:51] <edmundsalvacion> would it be better to be reading from the secondary in this case?
[18:33:55] <kali> edmundsalvacion: you're reading in natural order ? without sort ?
[18:35:18] <kali> i think it's because mongo just parses the whole collection in disk order. if some records are inserted before where your cursor is, you don't see them
[18:35:33] <kali> edmundsalvacion: it's what i think is happening, i'm not 100% sure
[18:37:02] <kali> edmundsalvacion: in your case, the cursor is on the index on foo
[18:37:24] <edmundsalvacion> its actually a uniq compound index on foo and bar
[18:38:12] <tsally> kali: you are getting at this? "MongoDB cursors do not provide a snapshot: if other write operations occur during the life of your cursor, it is unspecified if your application will see the results of those operations or not." http://www.mongodb.org/display/DOCS/Queries+and+Cursors
[18:38:46] <kali> tsally: ha ? i did not know it was explicit in the doc, but that was mostly my gut feeling
[18:39:07] <tsally> kali: seems like good intuition ^^
[18:39:59] <kali> edmundsalvacion: i have seen cases where i parse an entire collection top to bottom, adding some information to the documents. I know in that case i can "see" some records twice (because they no longer fit where they are and get bumped to the end of the collection)
[18:40:58] <kali> edmundsalvacion: if that's an issue, maybe the snapshot can help: http://www.mongodb.org/display/DOCS/How+to+do+Snapshotted+Queries+in+the+Mongo+Database
[18:41:09] <kali> edmundsalvacion: it must come at a price, I guess
[18:42:07] <kali> it looks like it only fixes "my" issue, not yours...
[18:52:26] <edmundsalvacion> kali: i have looked into snapshot mode, but it seems to not be able to use a secondary index
[18:55:33] <edmundsalvacion> these records are never updated, so i don't see how they could be bumped to the end of the collection
[18:56:28] <edmundsalvacion> while dupes are annoying, I'm not as concerned about them as i am about records not showing up at all
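(A sketch of the snapshot mode kali linked, with a hypothetical collection name: it walks the _id index so relocated documents aren't returned twice, but it can't be combined with sort() or hint() on another index, and it doesn't guarantee that documents inserted behind the cursor will be seen:)

```javascript
db.records.find().snapshot().forEach(function (doc) {
    // each document appears at most once, in _id-index order
});
```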
[19:16:31] <lorsch> Hi, i'm new to mongo db and i can't find an answer to the following question about mapreduce:
[19:16:31] <lorsch> The key emitted by the map() function becomes the _id in the collection which will be created, and it will also act as the shard key.
[19:16:31] <lorsch> I can't create an index on the created collection, because if i do so, the mapreduce won't run again...
[19:16:31] <lorsch> Is there no way to have my own index on the created collection?
[19:18:06] <lorsch> i want to do incremental map reduce with "out: {reduce: "collection"}", but without some indexes on the created collection it doesn't make sense at all
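(A sketch of the incremental pattern lorsch mentions; mapFn, reduceFn, lastRunTimestamp and the collection names are placeholders. out: {reduce: ...} folds the new results into the existing output collection via the reduce function, and the emitted key becomes that collection's _id:)

```javascript
db.events.mapReduce(mapFn, reduceFn, {
    query: { ts: { $gt: lastRunTimestamp } },  // only map documents added since the last run
    out: { reduce: "event_totals" }            // re-reduce into the existing output collection
});
```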
[19:32:42] <quuxman> Anybody here use pymongo? (I asked this yesterday, and was surprised to get no responses)
[19:39:30] <dgottlieb> I've used pymongo, mostly just the basic operations
[19:41:43] <quuxman> dgottlieb: the reason I keep asking is I wrote a very small library that's a helper for writing queries with pymongo, and I want some feedback on it
[19:42:51] <dgottlieb> Ah. I can take a look. Link?
[19:44:04] <trupheenix> i am trying to query a list of items stored on mongo db and display them on a web front end. i would like to display the first 20 items. then have a button to display the next 20 items. how do i do this using pymongo? not looking at code, just want a conceptual idea. thanks
[19:45:42] <quuxman> dgottlieb: which produce {'tags_index': {'$all': ('food', 'art')}}, and {'class_name': {'$in': ('Broadcast', 'Star')}, 'created': {'$gt': 1345232669.437062, '$lt': 1345837469.437065}} respectively
[19:46:27] <quuxman> dgottlieb: my main concerns are 1) I have the strong feeling I'm recreating something that already exists, though I can't find it, and 2) is there a better approach?
[19:48:02] <dgottlieb> trupheenix: mongodb has skip and limit functions you can put on queries. So you can do a limit(20) on all queries and a skip on (page_number * 20). I think with pymongo skip and limit are parameters to the find() function
[19:49:04] <quuxman> trupheenix: keep in mind that MongoDB does not have an efficient implementation of skip. It actually iterates through every previous result in your query
[19:49:32] <trupheenix> quuxman, dgottlieb thanks. any other alternatives you can suggest?
[19:49:41] <quuxman> trupheenix: the only reasonable way to implement this is simply creating an attribute that has your desired order, and using a comparison
[19:50:09] <trupheenix> quuxman, hmmmm then it becomes all the more complex!
[19:50:27] <quuxman> trupheenix: this results in making pagination for any non-trivial (dynamically determined) ordering a humungous pain in the ass
[19:50:35] <trupheenix> quuxman, i have a dynamically changing list
[19:51:12] <quuxman> in my opinion, this is mongo's largest weakness in comparison to mysql and postgres
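(A shell sketch of both approaches; the collection name, pageNumber and lastSeenId are placeholders. skip/limit is simple but the server still walks past everything it skips, while range-based paging anchors each page on the last value seen:)

```javascript
// 1) skip/limit: easy, but page N still scans roughly N * 20 index entries
db.posts.find().sort({ _id: -1 }).skip(pageNumber * 20).limit(20);

// 2) range-based: remember the last _id of the current page and continue
// from there; an index on the sort field keeps every page equally cheap
db.posts.find({ _id: { $lt: lastSeenId } }).sort({ _id: -1 }).limit(20);
```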
[19:51:17] <dgottlieb> quuxman: I think it's a cute simple wrapper to simplify those "advanced" operators. I'm sure others have been made for each of the languages.
[19:51:56] <quuxman> dgottlieb: I just get tired of typing brackets, quotes and '$'s
[19:52:20] <quuxman> especially when writing on-the-fly queries for analytics
[19:52:24] <dgottlieb> quuxman: As for better approaches, are you talking about trying to find a single API that everyone would enjoy using and adopt?
[19:52:44] <quuxman> dgottlieb: ideally I guess... or just a more concise sensical way of doing it myself
[19:53:05] <dgottlieb> quuxman: heh, if you think using map/list literals is bad, try turning your above query into Java :)
[19:53:32] <quuxman> dgottlieb: there is a reason I've never coded Java
[19:54:21] <quuxman> at least not for work... I did make a little paint / reaction diffusion exploration program once, just for the experience
[19:54:59] <dgottlieb> quuxman: I think if this is mostly for yourself, if it's good for you, it's good for me. I think it can be difficult to have everyone be happy with a lightweight API such as this.
[19:55:40] <dgottlieb> quuxman: I find that there are a few clear ways to do something like that and none are really any more right or wrong, just a matter of taste
[19:55:51] <quuxman> dgottlieb: mainly what I'm puzzled over is why something equivalent isn't included in pymongo. I understand the desire to keep the API as similar as possible to the underlying mongo API, but that doesn't mean you have to make things a pain in the ass
[19:55:59] <trupheenix> i got an idea: how about i cache the cursor while it iterates?
[19:56:23] <trupheenix> so i see the first 20 results. serve it out. store the cursor.
[19:56:25] <quuxman> trupheenix: if the pages are always requested in sequence, that would be _ok_
[19:56:41] <trupheenix> next call i retrieve the cursor and start off from last position?
[19:56:47] <quuxman> the thing is, there's no guarantee for that, and it makes your app non "restful"
[19:57:15] <dgottlieb> quuxman: Sorry if my feedback isn't very specific, I'm a little ambivalent to lightweight wrappers. I tend to talk with drivers directly
[19:57:38] <trupheenix> quuxman, dgottlieb i think i will have to do with skip and limit then
[19:57:49] <quuxman> dgottlieb: so far I haven't used it in application code (though I may start if I settle on this approach), just interactive use
[19:58:06] <quuxman> trupheenix: it's really not a big deal unless you have huge data sets and a significant amount of traffic
[19:58:24] <quuxman> don't prematurely optimize, it's just good to be aware of that weakness
[19:58:29] <trupheenix> quuxman, could have huge data sets and a significant amount of traffic. trying to design a message board.
[19:59:18] <quuxman> and money, if you have a team, and you're buying hardware
[19:59:55] <dgottlieb> quuxman: well, i think something like that isn't included merely because there are a few consistent ways to implement it, and the official drivers would rather not pick for the community and just leave it up to individuals to do what they wish. Also, if one driver were to officially support an alternative syntax for advanced queries, the other drivers would probably have to try to conform, which isn't always feasible for the officially supported languages
[20:00:42] <quuxman> dgottlieb: What're the other ways of doing it?
[20:02:17] <dgottlieb> quuxman: I think this was one of the original query builders that was adopted into a driver: http://api.mongodb.org/java/2.2/com/mongodb/QueryBuilder.html
[20:03:03] <quuxman> looks pretty similar in concept
[20:03:22] <dgottlieb> quuxman: I believe the conclusion from that experiment was to not make a habit of adopting these in general. But I'm really not an authoritative source on this at all
[20:09:36] <quuxman> dgottlieb: thanks for the feedback :)
[20:10:27] <dgottlieb> quuxman: heh, np. Sorry I couldn't be more helpful.
[20:10:48] <quuxman> although I'm not sure what, I feel like I learned something
[20:11:03] <dgottlieb> that's the most important part really!
[20:13:03] <dgottlieb> Taking a second look, I can say some people (ok, maybe just me) are not big fans of using argument lists (specifically on the `in` and `all` methods)
[20:13:36] <dgottlieb> I personally would prefer just doing the .in('key', ['value1', 'value2'])
[20:14:00] <quuxman> the main point of it is to get rid of brackets. I support both
[20:14:13] <dgottlieb> which I see is still possible, but I would make that mandatory and I understand your impetus to allow the other way :)
[20:14:27] <quuxman> yeah, I thought it was a little weird
[20:14:47] <quuxman> it does make it impossible to search for a nested list in a multikey
[20:15:24] <quuxman> unless you throw a dummy value in the beginning
[20:16:21] <quuxman> in general if a feature makes some theoretical behavior impossible to achieve in your library, it's a sign of bad design, no matter how remote the possibility of actually using that behavior
[21:12:23] <geoffeg> Is there some way, maybe with 2.2's aggregation stuff, to query based on a conditional of two fields in the same document? like db.foo.find({'create-date' : { $gt : 'edit-date' }})?
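(Two hedged options for this, using geoffeg's field names: a $where expression, which runs JavaScript per document and can't use an index, or a 2.2 aggregation pipeline that projects the comparison and filters on it:)

```javascript
// option 1: per-document JavaScript comparison
db.foo.find({ $where: "this['create-date'] > this['edit-date']" });

// option 2: 2.2 aggregation -- project the boolean, then match on it
// (only _id and the flag come back; add other fields to $project as needed)
db.foo.aggregate([
    { $project: { createdAfterEdit: { $gt: ["$create-date", "$edit-date"] } } },
    { $match: { createdAfterEdit: true } }
]);
```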
[22:04:58] <sneezewort> if I create a user in a db does that data get stored in that db?
[22:36:57] <Bartzy> For example, I have a worker that runs for a very long time - it instantiates a MongoCollection object on start (PHP driver) and when it gets a job it needs to use update to update some document. If the worker doesn't get a job for a while, will the connection drop? How do I catch that and "reconnect"?
[23:41:43] <Neptu> still need to learn quite a lot about mongo myself
[23:42:03] <Neptu> but I have a question regarding geospatial indexing
[23:43:02] <Neptu> I have like 200GB of records with 2D positions and I was wondering if I can use (and abuse) the collection system to keep organized smaller subsets of data to speed up search
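(The usual approach is a single 2d index rather than splitting data across collections, since the index already narrows the search; a sketch with hypothetical collection and field names, and example coordinates:)

```javascript
db.points.ensureIndex({ loc: "2d" });  // loc stored as [x, y]

// nearest neighbours around a coordinate
db.points.find({ loc: { $near: [-73.98, 40.75] } }).limit(50);

// everything inside a bounding box
db.points.find({ loc: { $within: { $box: [[-74.1, 40.6], [-73.8, 40.9]] } } });
```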
[23:43:17] <jrdn> http://blog.attachments.me/post/9712307785/a-fast-simple-queue-built-on-mongodb similar to that
[23:43:22] <jrdn> and similar problem to what we have too