[05:49:55] <jcims> quick question, i've got a csv with ~16M rows. each row has lat/long, and i'd like to merge them into a location field for geospatial searches. Is it possible to run an upsert or similar query after the import to copy the separate fields into one?
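There is no single server-side statement in this era of MongoDB that merges two fields into one, so the usual approach is a short script that loops over the collection, $sets a combined coordinate field, and then builds a 2d index on it. A minimal mongo shell sketch, assuming the fields are called lat/lng and the collection is places (these names are guesses, not the actual schema):

```javascript
// copy the separate fields into a single [lng, lat] array;
// 2d indexes expect longitude first
db.places.find().forEach(function(doc) {
    db.places.update({_id: doc._id}, {$set: {loc: [doc.lng, doc.lat]}});
});
// index the combined field for geospatial queries
db.places.ensureIndex({loc: "2d"});
```

For ~16M rows this will take a while from the shell; running the same loop from a driver script is the other common option.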
[06:27:21] <owen1> my collection has documents like this: {'chains':[{'id': 1}, {'id': 2}]}. i want to know if a given integer exists among all the ids. what's the best way to do that?
[06:28:07] <owen1> so in this case 1 will return true, but 3 false, since 3 doesn't exist in any document.
[06:28:22] <owen1> (there is only 1 document in my example)
[06:32:27] <crudson> owen1: .find({'chains.id':1}) will match your example document
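To turn that match into a yes/no answer, a count (or findOne) on the dot-notation query is enough. A small sketch against an illustrative collection name:

```javascript
// dot notation reaches into every element of the embedded array
db.things.find({'chains.id': 1}).count();        // > 0, so 1 exists in some document
db.things.findOne({'chains.id': 3}) !== null;    // false, 3 exists in no document
```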
[10:03:00] <gyre007> we have monitoring set up and we are getting loads of alerts... like intermittent alerts for replication lag. at the moment we check the number of seconds of lag and raise an alert at 30s, but then on the next check the alert is resolved... really annoying :)
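One way to reduce that flapping is to alert only when the lag stays above the threshold for several consecutive checks rather than on a single sample. The raw per-secondary lag can be read from the mongo shell, for example:

```javascript
// run on a replica set member; prints how many seconds
// each secondary is behind the primary
rs.printSlaveReplicationInfo();
```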
[10:15:04] <oskie> hello, if Mongo's sharding is automatic, why do you need to specify the key to shard on?
[10:35:57] <NodeX> oskie : because it doesn't make assumptions on your data
[12:06:25] <oskie> but one other question: where does the shard balancing occur? in mongod processes i presume? but what process initiates the balancing?
[14:01:09] <listerine> Hi, is there a command to dump a document and recursively its references/dependencies?
[14:01:59] <wereHamster> listerine: that depends on the driver. The driver resolves the references, not mongodb
[14:05:54] <listerine> wereHamster, got it .. I was thinking of doing it with mongodump or something like that, but instead I must do it using some mongo driver .. is there a scripting language that you recommend to sync just a given document between two environments?
[14:07:22] <wereHamster> get a driver for your favourite language and code it yourself
[14:12:06] <listerine> wereHamster, but what I need is some kind of dump because I need to transfer that document between environments .. if I do it in my favourite language I'll have the document in memory after querying and need to send it to the target environment. Should I write some service to receive this document on the target environment and perform some kind of merge between the incoming document and the document already there?
[14:12:57] <listerine> sorry if the question was confusing, I can reformulate it heh
[14:13:43] <wereHamster> open a connection to the first database, read the documents from it, open a connection to the second database, write the documents there.
[14:14:27] <wereHamster> you will have to write the merge algorithm yourself anyway, mongodb cannot know how to do that.
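A minimal mongo shell sketch of that advice, with made-up hosts, database, collection and query; any referenced documents would need the same copy step, resolved by hand:

```javascript
var src = connect("source-host:27017/mydb");
var dst = connect("target-host:27017/mydb");
var doc = src.articles.findOne({slug: "some-article"});
if (doc !== null) {
    // naive "merge": upsert the document wholesale into the target environment
    dst.articles.update({_id: doc._id}, doc, true);   // third argument = upsert
}
```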
[15:11:37] <remonvv> People need to stop with blogs like these; http://t.co/869JZ7TV
[15:12:27] <remonvv> "Use indexes" isn't a design and performance tip. It's like saying "Add Wheels" is a performance tip for making cars fast.
[15:12:33] <remonvv> Bit more in depth discussion would be good.
[15:14:33] <remonvv> I was looking forward to my end of the day blog read but @MongoDB failed me by not filtering content properly.
[15:14:48] <NodeX> this is the trouble with a product being both simple and good
[15:15:09] <NodeX> too many idiots think "Oh Perhaps I can do XYZ with mongo" and then these posts pop up everywhere
[15:16:04] <NodeX> I am sure *SQL was fast at one point until the majority got its hands on it and asked for foo, bar, baz to be implemented or John Hancock wrote a blog post about how to do things (which he really didn't know about)
[15:19:48] <remonvv> Well I certainly don't mind blogs about MongoDB indexes, but this is sort of "Yeah it's faster now. Not explaining why that is, not explaining how to make appropriate indexes for specific use cases, not explaining why you want to right-balance your indexes... but there it is. Indexes are awesome!"
[15:20:23] <remonvv> It would even be okay if this was part 1 in a multi part blog on indexes in MongoDB.
[15:23:34] <NodeX> apparently (from what they told my business partner) they've been watching my progress on a project for a long time and would like me to blog about the trials and tribulations
[15:24:39] <remonvv> I'm not a writer by any stretch of the imagination.
[15:25:06] <NodeX> I would certainly like to put some blogs out there with the "right" way to do things and the reasons for them
[15:25:47] <remonvv> But it's the small things that nobody blogs about. You know, in what order does MongoDB put documents if you sort them on field A where values for A are [1,2,8] and [0,1,9] if you sort ascending and descending.
[15:26:28] <remonvv> Or what are good production settings for drivers. Or how do you make sure only small parts of your indexes have to be kept in memory.
[15:26:42] <remonvv> In other words; "why isn't my shit working the way i expect it to"
[15:26:46] <NodeX> I'm not an "internals" person, I try a few ways and see which one works fastest - I don't often care why it does lol
[15:27:15] <NodeX> these are all good topics that should be addressed
[15:27:21] <remonvv> Fair enough but you'd be surprised how many SO questions are due to a lack of understanding of how b-trees work, or binary compares work, or how sharding works internally, etc.
[15:27:40] <remonvv> And then there's schema design
[15:27:40] <NodeX> perhaps if people had a primer on Mongo like that then a lot of their initial design could be thought out properly
[15:28:00] <remonvv> Why shouldn't you use values as field names, when to embed, when to denormalize, etc.
[15:28:12] <remonvv> Nobody goes beyond "Well say you're creating a blogging website"
[15:28:12] <NodeX> most -Mongo- related problems can be solved with better schema design imo
[15:28:18] <remonvv> Yes yes we know...blogging is awesome in MongoDB.
[15:35:44] <remonvv> I checked it out and there's very little documentation, the installer crashed, when I finally got it working it was stupidly slow and basically the worst of both worlds (NoSQL <-> RDBMS).
[15:39:44] <NodeX> https://github.com/richardwilly98/elasticsearch-river-mongodb <---- for es
[15:42:30] <remonvv> That's actually an interesting point. I'm working on a 100% MongoDB message queue but I was having some issues with multiple tailable cursors on the same collection. I'm assuming the solution above has to do that as well.
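For reference, a tailable cursor only works against a capped collection. A bare-bones sketch in the 2.x mongo shell, with an illustrative collection name and size (not remonvv's actual queue design):

```javascript
db.createCollection("queue", {capped: true, size: 1024 * 1024});
db.queue.insert({msg: "hello"});
var cur = db.queue.find()
            .addOption(DBQuery.Option.tailable)
            .addOption(DBQuery.Option.awaitData);
while (cur.hasNext()) {
    printjson(cur.next());   // awaitData makes hasNext() wait briefly for new documents
}
```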
[16:33:18] <t0ms> hi, is there any way how to build HA cluster of mongodb on 2 physical nodes?
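The usual answer is a replica set. With only two data-bearing machines you still need a third voting member to keep a majority for elections, typically an arbiter, which holds no data and can run on very small hardware; ideally it lives on a third box, since co-locating it with one of the two nodes means losing that box still costs the majority. A sketch with made-up hostnames:

```javascript
// run once in the mongo shell against the first node
rs.initiate({
    _id: "rs0",
    members: [
        {_id: 0, host: "node1:27017"},
        {_id: 1, host: "node2:27017"},
        {_id: 2, host: "arbiter-host:27017", arbiterOnly: true}   // no data, just a vote
    ]
});
```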
[16:45:45] <jcims> if i'm going to be running a 10gb database from a system with just a gig of memory, does it make any sense to move indexed attributes to a separate collection and link them back to the main collection by id?
[16:55:47] <jcims> it's got to be better than what i'm getting now :)
[17:04:05] <Almindor> how do you specify a date query to mongodump?
[17:04:05] <Almindor> I always used var s = new Date in the console, but how do you do it for mongodump?
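mongodump's --query option takes a JSON document rather than shell JavaScript, so the usual workaround is to express the date as extended JSON with epoch milliseconds. Something along these lines may work, though the exact date syntax varies by tool version and the db/collection/field names here are only illustrative:

```sh
mongodump --db mydb --collection events \
  --query '{"created": {"$gte": {"$date": 1349049600000}}}'
```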
[17:07:13] <shadow_fox> can't seem to start mongodb on my Linux Mint OS
[17:07:26] <shadow_fox> help needed someone please guide me
[17:08:18] <shadow_fox> i have downloaded the 32-bit linux tgz file, extracted it, renamed it to mongodb and moved the whole extracted package to /usr/local/bin/
[17:08:34] <shadow_fox> also created /data/db/ and set it to user read write
[17:17:24] <shadow_fox> wereHamster: that worked, now it's up and running. could you please tell me how to add it to the PATH so that i won't have to type the whole path again and again, if you don't mind
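Assuming the tarball really did end up in /usr/local/bin/mongodb as described above, adding its bin directory to the PATH in the shell profile is all that's needed, e.g.:

```sh
echo 'export PATH=$PATH:/usr/local/bin/mongodb/bin' >> ~/.bashrc
source ~/.bashrc
mongod --version   # should now resolve without typing the full path
```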
[17:17:42] <jcims> i do have rack space available for fairly cheap, i'd probably just colocate it if i wanted to crank up the memory
[17:17:46] <NodeX> wereHamster : how do you rate the hardware from them ? ... we're a bit dubious due to it not being enterprise level
[17:20:39] <wereHamster> well, you get very detailed specification of the hardware. The exact CPU model, whether the memory is ECC or not, whether the disk is enterprise class or not. And you can always do your own benchmark.
[17:20:58] <NodeX> In your experience with them are they reliable?
[17:22:03] <wereHamster> I can't tell. Disks always go bad, even on EC2 machines. But there you don't see it because it's all managed by amazon.
[17:22:22] <NodeX> I would never use amazon in production
[17:23:00] <wereHamster> I've had a few disks go bad on my OVH machine. But a support ticket was all that was needed to have the disk changed.
[17:23:41] <NodeX> we use redstation currently but their support is terrible
[17:39:19] <bindr> So I managed to put off upgrading to 2.2.0 until yesterday and immediately became afflicted by https://jira.mongodb.org/browse/SERVER-6909 which appears to be impacting one of my applications. I see from https://jira.mongodb.org/browse/SERVER/fixforversion/11494 that 2.2.1 with the fix for that was to go out yesterday. Is there an updated ETA for release 2.2.1? Is there a workaround I can put in place until then, short of having the devs adapt their code to better handle the error or not provoke it?
[17:56:48] <pingboy> can someone tell me how to make bson_ext ( presumably ) or the mongo ruby driver not convert my long int into a NumberLong() when an insert takes place?
[17:57:12] <pingboy> mongo ruby driver is 1.7.0 and bson_ext is 1.7.0
[17:57:28] <pingboy> my nodejs app writes longs into the same database and they appear like:
[17:57:58] <pingboy> which makes me want to go out to the pet store and buy a puppy so i can kick it in the balls
[18:06:51] <hillct> Good afternoon all. I'm trying to work out the proper syntax for $addToSet where I have an array element in my target document photographers.checkins=[12,5,24] but I can't seem to get the syntax correct. Can someone advise me as to the proper syntax?
[18:08:45] <awc7371> whats the best db viewer for linux?
[18:08:45] <hillct> where this is an update being performed using the findAndModify method
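A sketch of what that findAndModify could look like, assuming a photographers collection whose documents carry a checkins array as described; the query and the value being added are illustrative:

```javascript
db.photographers.findAndModify({
    query:  {name: "some photographer"},
    update: {$addToSet: {checkins: 31}},   // adds 31 only if it isn't already in the array
    new:    true                           // return the modified document
});
```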
[18:29:36] <pingboy> my kingdom for an answer to this NumberLong() issue.
[18:29:36] <Aartsie> how is it possible that my clean MongoDB is 3.2 GB?
[18:41:26] <Aartsie> does anyone know why my build is 3.2 GB?
[18:51:27] <awc7371> anyone use an elasticsearch river?
[18:52:07] <_m> awc7371: We do not as it indexes more than we require. But maybe I can help, what's up?
[18:52:38] <awc7371> just wondering if this is a stable / supported solution: https://github.com/richardwilly98/elasticsearch-river-mongodb
[18:53:16] <awc7371> i'm looking at using Lithium PHP, because of native mongo support, and the doc mentions an ES plugin, but I don't see one
[18:53:47] <awc7371> I ask if its stable / supported / a good idea, because I work at a large company and it's going to be a tough sell to my boss
[18:54:18] <_m> That's the one we tried. Seemed stable enough, but it didn't really fit our use-case
[18:54:54] <_m> As for the Lithium PHP… I have no idea.
[18:55:16] <awc7371> do you just use mongo to search? and were you basically duplicating data stored in mongo and es?
[18:57:15] <_m> Our entire dataset is Mongo. However, we only wanted to index specific fields/etc. Rolled our own minimal driver and interface to said driver
[18:57:15] <_m> We also tried tire, which hurt more than it should have.
[19:07:36] <_m> At Grooveshark, we used sphinx and the official mongodb php driver. Beyond that though, I have no idea which PHP extensions would be best.
[19:08:24] <awc7371> well there is Elastica, a very popular PHP es integration
[19:08:32] <awc7371> i wonder if it would be better to use that than a river
[19:10:45] <pingboy> anyone know how to get the mongo ruby driver and bson_ext to *not* convert stuff to NumberLong()
[19:11:07] <pingboy> like a raw mode or something?
[19:13:26] <_m> awc7371: What ended up being our "best solution" was to insert/update indexes on record updates via a simple driver. This allowed more flexibility and less overhead for indexing in general.
[19:15:57] <_m> The driver handles that as we remove records.
[19:16:56] <awc7371> nice, Elastica probably would provide all that for me
[19:17:05] <awc7371> with a river, i imagine deleting WOULD be a problem
[19:18:40] <awc7371> so you pretty much just CRUD everything twice?
[19:28:24] <hillct> What goes on here? https://jira.mongodb.org/browse/SERVER-831 Major issues go unassigned for two years?
[20:06:40] <oskie> hillct: looks like a major *feature request* to me
[20:07:53] <hillct> oskie: ok, you could look at it that way, or recognize that this sort of query is an obvious usage so the lack of the feature is itself a bug
[20:08:26] <hillct> It's like saying "SQL doesn't support spending strings fields"
[20:39:27] <Guest__> How can I be sure that an application is connecting to my mongodb database? I've got "mongod" up and running, but never see any text output from it, even after failing to log in to the application (I would assume it would show failed login attempts, since they access the database)
[20:46:03] <Guest__> Hmm, cause the app is configured to use pow.cx, so I guess it isn't connecting then? I'm so confused, haha.
[20:48:36] <_m> Guest__: Write a very simple endpoint to test this. If you're using rails, fire up the console and try to select/save/update a record.
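The server itself can also be asked whether anything is connected or running queries; two quick checks from the mongo shell:

```javascript
db.serverStatus().connections   // how many client connections are currently open
db.currentOp()                  // operations in flight right now, if any
```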
[20:49:56] <AndreKR> Can anyone explain this to me...
[20:50:08] <AndreKR> On http://www.mongodb.org/display/DOCS/Schema+Design it says: "If finding the most recent posts is a very frequent query, we would then shard on the when field."
[20:50:52] <AndreKR> Doesn't that mean that all the "recent posts" queries end up on one shard, so all the load is on one node?
[20:51:21] <AndreKR> Why is "when" a good shard key then?
[20:53:14] <_m> Sort order/indexing for time-based selections
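For context, "shard on the when field" just means declaring it as the shard key when the collection is sharded; database and collection names below are illustrative:

```javascript
sh.enableSharding("blog");
sh.shardCollection("blog.posts", {when: 1});
```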
[21:48:56] <owen1> Error: couldn't connect to server 127.0.0.1:27017 src/mongo/shell/mongo.js:91
[22:49:44] <owen1> mgriffin: i don't see mongod inside init.d
[22:50:16] <owen1> what is that initscript-9.03.31-2 ...
[22:56:00] <mgriffin> btw, the box i ran that on is using upstart and does have the service command..
[22:56:13] <chiel> hi all. i'm using the ruby driver to query mongodb and i was wondering how i would filter based on a property that's not at root level. i assume this is possible?
[22:56:16] <mgriffin> owen1: you seem to have a strange configuration.
[22:56:33] <mgriffin> owen1: i would perhaps suggest asking #centos for help
[22:57:26] <owen1> they actually don't deal with mongo
[22:57:35] <mgriffin> you are not asking mongo questions
[22:58:02] <mgriffin> another repo they might be more familiar with is EPEL http://dl.fedoraproject.org/pub/epel/6/x86_64/repoview/mongodb-server.html
[22:58:10] <mgriffin> but i dont think the repo is your problem.
[22:59:46] <mgriffin> owen1: actually, do you have a /etc/rc.d/init.d/mongod ?
[23:00:05] <mgriffin> if so, then it was just not added to chkconfig
[23:00:21] <mgriffin> so chkconfig --add mongod should create /etc/init.d/mongod
[23:00:29] <mgriffin> as to the missing service command, no idea.
[23:01:10] <owen1> mgriffin: i don't have the mongod file
[23:01:28] <mgriffin> then you didn't install the packages?
[23:02:27] <mgriffin> i'm surprised mongo docs don't use something simpler like rpm -Uvh http://downloads-distro.mongodb.org/repo/redhat/os/mongo10gen-release.rpm && yum install mongo-10gen mongo-10gen-server
[23:03:00] <mgriffin> owen1: do rpm -q mongo-10gen-server
[23:28:03] <_m> You're welcome! Hopefully that points you in the right direction.
[23:42:25] <kisielk_> does mongodb handle lots of connections coming and going well? or is it better to write a service that connects to it once and send data there?
[23:44:38] <bindr> So in the unlikely event anyone noticed my earlier question about 2.2.1, I resolved my issue with the findAndModify error in 2.2.0 by swapping in the 2.2.1-rc0 binaries from dl.mongodb.org and that seems to have been fine. That should hold me until there's an actual release of 2.2.1
[23:46:51] <_m> kisielk_: I've found it better to use a persistent connection.
[23:47:36] <kisielk_> I have a batch processing cluster, it can run about 400 jobs simultaneously, and each one will need to send its data to mongo
[23:47:47] <kisielk_> so I guess it would be better to funnel them through some central point?
[23:49:53] <_m> We use one connection per machine.
[23:53:47] <kisielk_> _m: that's a good idea actually, I can run a daemon on each node that will forward the files