#mongodb logs for Monday the 2nd of March, 2015

[00:18:01] <obeardly> boutell: if you're still here, I'm fairly certain at this point it's not a MongoDB issue at all, but a supervisord issue.
[00:18:14] <boutell> cool
[00:19:10] <obeardly> but thanks for the offer to help
[03:05:08] <MLM> Why are the objects passed back from a Mongoose find query not working with `.hasOwnProperty(key)`?
[03:06:16] <MLM> console.log(user.hasOwnProperty('username'), user['username'], user);
[03:06:21] <MLM> results in
[03:06:24] <MLM> false 'mlm' { _id: 54effc576676066c20293010, username: 'mlm', __v: 0 }
[03:12:30] <MLM> Full barebones snippet: http://pastebin.com/bYm0qdSf
[03:17:12] <MLM> Fully reproducible snippet with model: http://pastebin.com/E49Ar5Ry
[04:09:48] <aliasc> Hello
[04:10:17] <joannac> hi
[04:10:50] <aliasc> whats best embedded documents or normalized
[04:10:58] <joannac> depends
[04:12:36] <aliasc> yea well i still dont get whats the best way to structure data in mongodb
[04:12:55] <aliasc> i always end up linking documents with ids in normalized way
[04:13:09] <aliasc> just like relationship models
[04:13:13] <preaction> the best way is the one that maximizes storage efficiency and makes querying as simple as possible
[04:14:30] <aliasc> suppose i have a channel with thousands of videos
[04:14:42] <aliasc> and a Channels collection with my channel inside
[04:15:03] <aliasc> i can't embed videos as arrays in the document since the document would grow
[04:15:09] <aliasc> so large it will exceed the limit
[04:16:24] <aliasc> so i need a separate Videos collection
[04:17:02] <aliasc> and link documents with ids, i think this is not the way mongodb is meant to work right ?
[04:19:23] <preaction> for that exact situation, it sounds like that is what you want
[04:19:44] <preaction> but if videos have comments, you could make those as an array of inner documents
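
A rough sketch of the layout preaction describes, in Python with PyMongo (database, collection, and field names here are invented for illustration, and the 2.x-era insert/ensure_index helpers are assumed): videos get their own collection and point back to the channel by id, while small, bounded things like comments stay embedded.

    from pymongo import MongoClient

    db = MongoClient().mydb  # hypothetical database

    # One document per channel; videos are NOT embedded here.
    channel_id = db.channels.insert({"name": "my channel"})

    # One document per video, linked back to its channel by id.
    # Comments are small and bounded, so they can live as an embedded array.
    db.videos.insert({
        "channel_id": channel_id,
        "title": "intro video",
        "comments": [{"user": "alice", "text": "nice"}],
    })

    # An index on channel_id keeps "all videos for this channel" fast.
    db.videos.ensure_index("channel_id")
    videos = list(db.videos.find({"channel_id": channel_id}))
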
[04:20:25] <aliasc> i've just read an interesting answer in stackoverflow
[04:20:53] <aliasc> embedding is good for small documents that dont grow fast and do not change regularly
[04:21:31] <aliasc> actually for my project i dont perform too much actions on videos
[04:22:32] <aliasc> its not even a video sharing project, im using mongodb as cache db to compare and control some data
[04:22:43] <aliasc> on channels we own on youtube through youtube-api
[04:24:07] <aliasc> how to insert documents if they don't exist with batchinsert?
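
(No one picked this up in the channel, but one common pattern for "insert if it doesn't exist" is an upsert per document; a hedged sketch, with the key, collection name, and incoming_docs iterable all invented:)

    from pymongo import MongoClient

    db = MongoClient().mydb

    # $setOnInsert only writes the fields when a new document is created,
    # so existing documents are left untouched.
    for doc in incoming_docs:  # incoming_docs: a hypothetical iterable of dicts
        db.videos.update(
            {"youtube_id": doc["youtube_id"]},  # the "does it exist" key
            {"$setOnInsert": doc},
            upsert=True,
        )
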
[04:40:54] <MacWinner> HI, with gridFS or with the gridfs files collection, is there a way I can find all files that are under a prefix? like filename: "/mypath/subdir/*"
[04:41:18] <MacWinner> i have all my files stored in gridfs with a filename key that corresponds to a linux file path
[04:49:31] <joannac> MacWinner: isn't that just a regex search?
[04:49:41] <MacWinner> yeah.. just got it working
[04:49:58] <MacWinner> didn't know if there was some special gridfs search
[04:50:09] <MacWinner> mongofiles seems to have a commandline switch for it
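
For reference, the "regex search" boils down to a left-anchored regex on the filename field of the fs.files metadata collection; a small PyMongo sketch (the path prefix is MacWinner's example, the database name is invented):

    import re
    from pymongo import MongoClient

    db = MongoClient().mydb

    # GridFS keeps one metadata document per file in fs.files;
    # anchoring the regex at the start makes it a prefix search.
    prefix = re.compile(r"^/mypath/subdir/")
    for f in db["fs.files"].find({"filename": prefix}):
        print(f["filename"], f["length"])
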
[05:07:55] <MacWinner> running into weird issue with php driver. where findOne() returns a document.. but find() does not with the same exact query: eg: $files = $collection->findOne(["md5" => "d41d8cd98f00b204e9800998ecf8427e"]);
[05:08:08] <MacWinner> if I replace findOne, with just find, i don't get any results back
[05:09:18] <MacWinner> oh. nm. brain fart
[05:13:58] <zivix> Array?
[05:54:22] <guest999> hi there. relative newbie, but is there a way to store > 4MB in a document ?
[05:56:38] <joannac> guest999: yes? documents are limited to 16MB
[05:58:14] <guest999> thanks. just starting to get into Document -oriented databases (i come from relational). im a developer
[07:13:58] <daidoji> hello I'm trying to do a bulk import from the first method on this page http://api.mongodb.org/python/current/examples/bulk.html
[07:14:08] <daidoji> and I can get the example to work with xrange
[07:14:18] <daidoji> but I'm having trouble getting it to work with a generator I define by hand
[07:14:22] <daidoji> anybody know whats up?
[07:17:00] <daidoji> I can do it when I pass it a list though...
[07:18:40] <joannac> what's the error?
[07:19:20] <daidoji> it gives no error
[07:19:24] <daidoji> it just inserts one record
[07:19:39] <daidoji> like it works fine on the xrange example though
[07:19:44] <daidoji> here I'll gist my code
[07:20:31] <daidoji> https://gist.github.com/daidoji/d85c62344cbeea59b045
[07:21:48] <daidoji> I am on version 2.7.2 of pymongo though
[07:23:54] <joannac> are you sure it's coming out in the right format?
[07:24:57] <daidoji> it comes out as a dict in the debugger
[07:28:19] <daidoji> and if I print it it comes out fine
[07:28:25] <daidoji> just seems to have this problem in pymongo
[07:29:48] <daidoji> oh well, its time for sleep at this point but its frustrating when something that should work according to docs doesn't :-(
[07:32:02] <daidoji> but yeah it seems to be limited to generators, if I instantiate the list it seems to work
[07:32:53] <daidoji> whoops no it doesn't...
[07:32:59] <daidoji> oh well, sleep time I guess
[07:35:41] <daidoji> oh wait I see
[07:36:03] <daidoji> continue_on_error = False just cancels the insert on error
[07:36:07] <daidoji> thats silly
[07:36:50] <daidoji> or at least different from the way the initialize_ordered_bulk_op() works
[07:36:56] <daidoji> well good to know I figured it out
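
For anyone hitting the same thing, here is roughly the comparison daidoji is making, sketched against PyMongo 2.7 (collection and field names invented): insert() with continue_on_error keeps going past failures but only reports the last error, while the unordered bulk API raises a BulkWriteError carrying every failed operation.

    from pymongo import MongoClient
    from pymongo.errors import BulkWriteError

    db = MongoClient().mydb

    def gen_docs():
        # placeholder generator yielding plain dicts
        for i in range(1000):
            yield {"_id": i, "value": i * 2}

    # Plain bulk insert: continues past errors, but only the last one is reported.
    db.things.insert(gen_docs(), continue_on_error=True)

    # Unordered bulk op: full per-operation error details on execute().
    bulk = db.things.initialize_unordered_bulk_op()
    for doc in gen_docs():
        bulk.insert(doc)
    try:
        bulk.execute()
    except BulkWriteError as exc:
        print(exc.details["writeErrors"])
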
[07:53:06] <errakeshpd> why mongodb ?
[08:01:52] <preaction> why not
[08:02:07] <tylerdmace> this is getting pretty heavy
[08:39:40] <morenoh149> errakeshpd: because sqlite gets sad
[08:40:18] <errakeshpd> ok and is this mongo is realy cuited for rails ?
[08:40:37] <errakeshpd> sorry "suited " not "cuited"
[08:41:57] <morenoh149> it's very cuited for rails
[10:20:12] <amitprakash> Hi, while attempting to create a user for a db, i am getting a "No role named userAdminAnyDatabase@database" error
[10:20:15] <amitprakash> what gives?
[10:27:46] <joannac> amitprakash: you're specifying the role incorrectly
[10:28:06] <amitprakash> joannac, what role should I be specifying? mongo docs tell me this is one of the default roles
[10:29:24] <joannac> amitprakash: how are you creating the user?
[10:30:02] <amitprakash> db.createUser({user: "username", pwd: "password", roles: [{role: "userAdminAnyDatabase", db: "application_db"}]})
[10:30:52] <joannac> " and it's only valid against admin
[10:31:26] <amitprakash> what?
[10:31:27] <joannac> the default role is called "userAdminAnyDatabase" and it's only valid against admin
[10:31:31] <amitprakash> oh
[10:32:34] <joannac> http://docs.mongodb.org/manual/reference/built-in-roles/#all-database-roles
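
In other words the role itself is fine, it just has to be granted in the admin database. A rough PyMongo equivalent of the corrected call (credentials are placeholders, and it assumes you are already authenticated with enough privilege to create users):

    from pymongo import MongoClient

    client = MongoClient()

    # userAdminAnyDatabase is only valid when the user lives in "admin".
    client.admin.command(
        "createUser", "username",
        pwd="password",
        roles=[{"role": "userAdminAnyDatabase", "db": "admin"}],
    )
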
[11:17:40] <pamp> hi
[11:18:17] <pamp> has anyone had a problem with index creation
[11:18:27] <pamp> when the key is too large to index
[11:18:28] <pamp> ?
[11:19:17] <joannac> huh?
[11:19:27] <pamp> http://dpaste.com/2039C5K
[11:19:42] <pamp> my collection for example
[11:20:01] <pamp> i need to creat index on fields k and v in the pros array
[11:20:18] <Lujeni> pamp, your index is not fully in memory?
[11:20:31] <pamp> but i get the error "key too large to index"
[11:20:47] <bowlingx1> hey, I have a question about a gist: https://gist.github.com/BowlingX/2b5e4420f1da73decb4d#file-gistfile1-js-L14-L16. I have lots of data that I'm going to iterate over with this script (for a migration).
[11:20:50] <pamp> I don't know, i can't create it :/
[11:21:09] <joannac> pamp: I have no idea how that could happen on that data
[11:21:28] <bowlingx1> I have 34 documents in the given collection, but for only 4 is the "update" callback method called
[11:21:33] <joannac> if your fields k or v are ever > 1kb, mongodb won't let you index it
[11:22:10] <bowlingx1> but everything is written correctly…Only my futures are not resolved correctly (line 25) because not all callbacks are called
[11:22:11] <pamp> i can create index on field "k", but can't on field "v"
[11:22:45] <joannac> pamp: is the value for the "v" field ever > 1kb?
[11:25:51] <pamp> i have millions of records... and in a few cases v is an array with a lot of properties
[11:25:59] <pamp> and yes may can have more than 1kb
[11:26:45] <joannac> right. there's your answer
[11:27:54] <joannac> also, if your v could be an array, you need to rethink your schema
[11:43:24] <pamp> how can I know the size of a particular field in a specific document??
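
One way to answer that: the mongo shell's Object.bsonsize() reports a document's BSON size, and from Python you can encode just the suspect field ("v" in pamp's case; the collection name below is a placeholder). Keys over roughly 1 KB are what trigger "key too large to index".

    import bson
    from pymongo import MongoClient

    db = MongoClient().mydb

    doc = db.mycollection.find_one()  # any document you want to inspect

    # Size in bytes of the BSON encoding of just the "v" field.
    # (bson.BSON.encode is the PyMongo 2.x spelling; newer versions use bson.encode.)
    print(len(bson.BSON.encode({"v": doc["v"]})))
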
[12:11:04] <Waheedi> Why would mongodb crash on this. http://pastie.org/9993441
[12:11:24] <Waheedi> i understand something fishy is happening here but it should not crash
[12:24:01] <rosenrot87> I do have a question about a weird behaviour of the MongoDB Driver for C#, it works flawlessly if I do not use the replSet option. Once I activate this option, I get a connection refused from the server. What am I missing?
[12:26:43] <StephenLynx> i think it is for replica sets.
[12:26:48] <StephenLynx> do you have a replica set?
[12:27:48] <rosenrot87> I use the option replSet=rs0 in the mongod.conf
[12:28:10] <rosenrot87> i also use rs.init() and there is a oplog.rs within the local database
[12:28:41] <rosenrot87> if i comment the replset=rs0 within the conf file it works
[12:30:19] <rosenrot87> I use the oplog.rs to get notified if there are changes within my database. This is the only reason why i use replica sets
[12:30:55] <Derick> what's the full error that you get?
[12:34:15] <rosenrot87> Exception:Caught: "No connection could be made because the target machine actively refused it" (System.Net.Sockets.SocketException) A System.Net.Sockets.SocketException was caught: "No connection could be made because the target machine actively refused it"
[12:34:57] <Derick> and what's in your mongodb log?
[12:35:02] <Derick> what's the connection string that you used?
[12:35:33] <rosenrot87> connectionstring = mongodb://user:pass@111.111.111.111:27017/db
[12:36:20] <Derick> are you using the same IP address inside your replicaset config?
[12:36:25] <Derick> rs.status() should show it
[12:39:11] <rosenrot87> this is the log: http://pastie.org/9993497
[12:41:15] <Waheedi> how much would this take usually? db.terms.remove(d._id)
[12:41:32] <Waheedi> time*
[12:41:35] <rosenrot87> Here is the output of rs.status() with the log: http://pastie.org/9993500
[12:42:03] <Waheedi> rosenrot87: you don't have any members in that status
[12:43:12] <rosenrot87> Waheedi: I want to access the database as before from a c# application. Do I need to specify a member?
[12:43:51] <Waheedi> if you want to have a replica set you need to add members to the replica set :)
[12:44:08] <Waheedi> unless you don't want to
[12:44:29] <rosenrot87> Waheedi: But then I need to specify the IP of my computer, which is random?
[12:45:20] <joannac> Waheedi: why shouldn;t it crash?
[12:45:39] <joannac> it needs to keep up to date and it can't.
[12:46:06] <Waheedi> because it is a software joannac. it should notify me it can't and act as a read only or something
[12:46:12] <Waheedi> but not to crash joannac
[12:46:20] <joannac> Waheedi: read only for what? it's behind
[12:46:31] <joannac> and only going to get more and more behind
[12:46:55] <Waheedi> in the first place it should not be behind :)
[12:47:10] <joannac> rosenrot87: can you connect to the mongod from the mongo shell, where you app server is?
[12:47:30] <joannac> Waheedi: erm okay. it failed a unique index. what do you expect it to do?
[12:47:41] <Waheedi> not to crash indeed
[12:47:45] <joannac> why?
[12:47:47] <Waheedi> handle that shit
[12:47:54] <joannac> what benefit would you get from it staying up?
[12:48:04] <Waheedi> it does not affect the replica set
[12:48:12] <joannac> it can't take writes. it had stale data so you probably don't want to read
[12:48:25] <rosenrot87> joannac: I can connect from everyelse, even from my shell on my client without any problem
[12:48:28] <Waheedi> no i want to read actually joannac
[12:48:44] <GothAlice> Waheedi: Except that it detected that its data is wrong. Unless you _want_ wrong data coming back in answer to queries, you pretty much want it to "crash".
[12:48:47] <joannac> Waheedi: you want to read from a member that's behind? why?
[12:49:23] <rosenrot87> joannac: it is only the mongodb driver which tells me connection refused. this is why I'm here :)
[12:49:29] <guest999> hi. im using the windows driver for mongoDB, but cannot resolve the 'Mongo' class in Visual studio. I have referenced MongoDB.Driver.dll and MongoDB.Bson.dll and have the necessary 'using' clauses
[12:49:49] <GothAlice> Simply have automation to detect removal of a member which spins up a replacement. High-availability means you need monitoring and automatic resolution of certain issues.
[12:50:10] <guest999> ... i have the Mongo system up and running
[12:50:11] <rosenrot87> guest999: Did you use nuget to install it? is your .net version > 4?
[12:50:23] <joannac> rosenrot87: weird. you can connect to it using the same user/password/IP?
[12:50:31] <rosenrot87> yes
[12:51:04] <Waheedi> alright! very convincing
[12:51:17] <rosenrot87> joannac: I can connect from my windows gui app, my windows shell, I did not change anything on the client side, only the replset within the conf file on the server running mongodb
[12:51:24] <guest999> rosenrot87: yes. .Net 4.5 and yes using Nuget. The dll files don't seem to have a class called 'Mongo' anywhere that i can see !?!
[12:51:48] <joannac> rosenrot87: that's very weird.
[12:52:13] <rosenrot87> guest999: what about MongoServer,MongoClient?
[12:52:43] <rosenrot87> joannac: if i remove the flag from the config I can connect again....any ideas?
[12:52:58] <guest999> rosenrot87: yes i can see thoes. the tutorials im reading refer to 'Mongo.Connect()'
[12:53:30] <rosenrot87> guest999: look for another tutorial, there were a lot of changes. keep to the latest ones 2.6.8
[12:54:29] <joannac> rosenrot87: what version of driver?
[12:54:32] <boutell> MLM: maybe Mongoose is using getters and setters (if you’re still here…)
[12:54:51] <boutell> they might not be actual properties
[12:55:02] <rosenrot87> guest999: I also experienced this. Even the tutorial of the MongoDB Driver website says you should use client.GetServer(), but this is already deprecated
[12:55:10] <rosenrot87> 2.0.0 beta2
[12:55:29] <guest999> rosenrot87: ah! ok cheers. im using Mongo 2.6.8
[12:55:29] <joannac> why are you using a beta driver?
[12:56:12] <guest999> rosenrot87: oh dear. i guess i'll try and work it out somehow.
[12:56:20] <rosenrot87> joannac: because i just started with mongodb and thought for something basic like connecting to a database even a beta driver should work
[12:56:45] <GothAlice> "Beta" and "should work" are a strange combination.
[12:57:18] <rosenrot87> I mean I do not want to do fancy stuff. Connecting to a database should work in a beta, at least from my point of view :)
[12:57:26] <rosenrot87> otherwise it should be called alpha :)
[12:58:15] <rosenrot87> Maybe there is one of the developers here which could verify if this is a bug maybe?
[12:59:02] <GothAlice> Both represent "pre-release, use at own risk". Neither should be assumed to do anything other than explode at the least opportune time.
[12:59:09] <joannac> I would try a non-beta version and see if that works
[12:59:23] <GothAlice> Indeed. When in doubt, avoid experimental code.
[12:59:33] <joannac> if it does, then I would consider filing a bug report
[12:59:47] <joannac> however my money is on a configuration problem
[13:01:14] <joannac> rosenrot87: I would also like to see the output of the working cases i.e. windows gui, mongo shell
[13:01:21] <rosenrot87> joannac: which configuration problem?
[13:02:24] <joannac> rosenrot87: I don't know, that's why I'm asking you to test
[13:02:34] <joannac> and asking for more data
[13:03:07] <joannac> i highly doubt a driver would have gone out not being able to connect to a mongod, like you said it's a pretty basic feature :)
[13:03:58] <joannac> so I'm asking you for more data to independently verify what you said
[13:05:49] <guest999> all tutorials i see use the statement: var m = MongoDB(); but in MongoDB 2.6.8 using driver 1.10, the website says http://docs.mongodb.org/ecosystem/drivers/csharp/ it's ok to do it, but i cannot see this 'MongoDB' class.
[13:06:19] <Waheedi> so if this db.terms.remove(d._id) is taking more than 40 minutes to execute that means there is definitely something wrong in there right?
[13:06:40] <guest999> is there a list or something showing the API and/or whats deprecated ? (this is all very confusing for me)
[13:06:50] <rosenrot87> joannac: both the shell and the gui application work right now...i can use them as usual
[13:07:19] <joannac> Waheedi: is that one _id?
[13:07:26] <joannac> rosenrot87: gist please
[13:07:54] <joannac> rosenrot87: also ouput of `telnet IPaddress 27017`
[13:08:07] <rosenrot87> joannac: now I reverted to the stable driver and I get the error: "No such host is known"
[13:08:27] <joannac> rosenrot87: erm, okay. so does the IP resolve or not?
[13:08:42] <Waheedi> joannac: YES
[13:08:58] <joannac> Waheedi: what does db.currentOp(true) say?
[13:09:45] <Waheedi> many things joannac
[13:11:07] <rosenrot87> joannac: sure, this is just the case if the replset is set and within my c# app
[13:11:16] <Waheedi> joannac: http://pastie.org/9993561
[13:11:49] <joannac> numYields : 1200
[13:11:57] <joannac> what is going on in your system?
[13:12:10] <Waheedi> joannac: should i really tell you
[13:13:04] <Waheedi> joannac: I'm using a hosting company famous one :) and few cloud block storage volumes lost connection. and this machine was a primary node in replset
[13:13:55] <Waheedi> another node got elected to be the primary
[13:14:29] <Waheedi> but unfortunately that node was really outdated and it didn't shutdown or crash
[13:14:52] <Waheedi> while it was 1 month outdated*
[13:15:14] <joannac> what version of mongod?
[13:15:39] <Waheedi> db version v2.4.10
[13:18:30] <Waheedi> btw the db.terms.remove("id") didn't finish
[13:19:18] <GothAlice> Considering "id" isn't a query, I'm not sure what you'd even expect that to do. :/
[13:19:28] <joannac> yeah, what GothAlice said
[13:20:28] <joannac> Waheedi: what's id?
[13:20:50] <Waheedi> GothAlice: you are very smart dude
[13:21:42] <Waheedi> its ObjectId("54f0e60931326c1e3d000d6b")
[13:22:13] <joannac> that's also not a valid query
[13:22:28] <joannac> db.terms.remove({_id: ObjectId("54f0e60931326c1e3d000d6b")})
[13:22:34] <joannac> ^^ that's a valid query
[13:23:41] <GothAlice> Waheedi: I'm certainly something, but classification appears to elude you. You seem very obstinate for one requesting assistance; writing p-code describing what's going on is far less useful than providing logs demonstrating without modification or the need for assumptions what is _actually_ going on.
[13:24:14] <rosenrot87> joannac: I can connect from the shell when replset is OFF and ON, once it responds with ">" and the other time with "rs0:primary>" :) Like it should
[13:24:53] <joannac> rosenrot87: yes, I want to see the output, including what options you give the shell
[13:25:01] <Waheedi> alright
[13:25:20] <Waheedi> sounds promising
[13:27:38] <rosenrot87> joannac: mongo.exe -u mongo -p pass 111.111.111.111:27017/db MongoDB shell version: 2.6.8 connecting to: 111.111.111.111:27017/db rs0:PRIMARY>
[13:28:07] <Waheedi> Did you guys really try this db.terms.remove(ObjectId("54f0e60931326c1e3d000d6b"))??
[13:28:56] <GothAlice> Waheedi: http://docs.mongodb.org/manual/reference/method/db.collection.remove/
[13:28:59] <cheeser> remove() takes a json document.
[13:29:55] <Waheedi> then something is broken
[13:32:20] <Waheedi> http://pastie.org/9993596
[13:32:22] <joannac> rosenrot87: paste your connection code. paste the output of the telnet command I gave you earlier
[13:32:54] <kaushikdr> hey guys, how to add extra fields to the metadata of gridfs?
[13:33:26] <rosenrot87> joannac: http://pastie.org/9993597 I will do the telnet now
[13:33:46] <joannac> Waheedi: if that's in the mongo shell, that will not work
[13:34:17] <Waheedi> then if you can trouble your self and try it you will see that it will work...
[13:34:20] <Waheedi> my version
[13:34:49] <spuz> Hello, is it possible to update a document only if a certain field matches a certain value?
[13:35:06] <joannac> spuz: yes, that's what the query part of the update is for
[13:35:13] <spuz> joannac: oh ok
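
Concretely, the first argument to update is itself a query, so the write simply matches nothing when the condition doesn't hold; a small sketch with invented collection and field names (PyMongo 2.x update helper):

    from pymongo import MongoClient

    db = MongoClient().mydb

    # Only documents whose status is currently "pending" are touched.
    result = db.orders.update(
        {"_id": order_id, "status": "pending"},   # order_id: placeholder
        {"$set": {"status": "shipped"}},
    )
    print(result["n"])  # 0 when the condition didn't match
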
[13:35:32] <joannac> Waheedi: watch your attitude.
[13:35:58] <Waheedi> i will :)
[13:36:12] <Waheedi> i was
[13:36:15] <Waheedi> lol
[13:36:16] <GothAlice> Waheedi: RTFM and see that even if it works, it's not how you're supposed to do it. The fine manual does not mention that usage anywhere in the documentation for the command, thus the ability to use it the way you are is a fluke, unsupported, and subject to change without notice.
[13:36:33] <GothAlice> I.e. stop it. ;)
[13:36:44] <Waheedi> alright :)
[13:36:49] <joannac> Waheedi: line 16 makes no sense
[13:37:05] <joannac> it returns nothing. as expected?
[13:37:14] <joannac> because you removed the document
[13:37:57] <Waheedi> all clear
[13:38:35] <rosenrot87> joannac: http://pastie.org/9993604
[13:39:04] <joannac> rosenrot87: what. the hell.
[13:39:15] <joannac> the connection succeeds in your C# code?
[13:39:35] <GothAlice> rosenrot87: I haven't been fully following your issues, but in your terminal are you able to execute "db.getCollectionNames()"?
[13:39:49] <cheeser> connections aren't created when you "new" a MongoClient.
[13:40:08] <cheeser> at least the java driver (which shares DNA with the .net driver) lazily connections.
[13:40:11] <cheeser> connnects.
[13:40:13] <cheeser> fuck it. :)
[13:40:15] <GothAlice> A connection is one thing. Successfully authenticating is another, too. :)
[13:40:19] <rosenrot87> GothAlice: In shell everything works
[13:40:23] <GothAlice> Cool, cool.
[13:40:38] <cheeser> so even though the new "MongoClient" "succeeds" you're not actually connected until you try to talk to the db
[13:40:39] <joannac> rosenrot87: when I say "I want the output of the telnet command", I want the DAMN OUTPUT
[13:40:44] <rosenrot87> joannac: Yes with the stable version it connects
[13:40:50] <joannac> not your interpretation of the output. the raw output
[13:41:09] <joannac> thanks GothAlice <3
[13:41:17] <rosenrot87> joannac: sorry. How to get the output in windows?
[13:41:40] <joannac> I dunno, run the telnet and copy+paste from the cmd window
[13:42:21] <GothAlice> rosenrot87: Worst-case, screen-shot. Best case, copy/pastebin.
[13:43:13] <joannac> cheeser: but the server.getDatabase should make the connection actually instantiate, right?
[13:43:18] <cheeser> correct
[13:43:27] <rosenrot87> There is no output
[13:43:35] <rosenrot87> the window just keeps blank
[13:43:36] <joannac> rosenrot87: none at all? blank screen?
[13:43:55] <joannac> that means there's no actual connection
[13:44:02] <GothAlice> All hail the mighty firewall?
[13:44:08] <joannac> so the question is, how come the mongo shell works?
[13:44:22] <rosenrot87> telnet server port -f output.txt should write everything to file.....file is just empty
[13:45:10] <rosenrot87> joannac: If i hit enter several times...I come back to the cmd again
[13:48:50] <joannac> rosenrot87: can you add the output of db.isMaster() in the mongo shell?
[13:49:13] <amitprakash> Hi, I have a collection package reflecting manifested shipments ( assigned waybills w/o information such as consignee/address etc )
[13:49:46] <amitprakash> When we get the actual shipment information, we go and individually update each document in the collection
[13:50:03] <amitprakash> However, would it be possible to bulk update these ?
[13:50:23] <amitprakash> i.e. when _id = blah, update with this and when _id = blahblha update with this_ and so on
[13:50:39] <GothAlice> amitprakash: Technically, yes, there do exist bulk operations. However they don't operate the way you're wanting, really.
[13:50:54] <rosenrot87> joannac: http://pastie.org/9993631
[13:51:00] <GothAlice> amitprakash: See: http://docs.mongodb.org/manual/core/bulk-write-operations/
[13:51:02] <joannac> isn't that just 2 updates with multi:true?
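
A sketch of what that bulk form can look like in PyMongo 2.7 for amitprakash's case, one queued update per waybill (the updates mapping and the field contents are invented):

    from pymongo import MongoClient

    db = MongoClient().mydb
    bulk = db.packages.initialize_unordered_bulk_op()

    # updates: hypothetical mapping of {_id: fields to set once shipment info arrives}
    for _id, fields in updates.items():
        bulk.find({"_id": _id}).update_one({"$set": fields})

    result = bulk.execute()
    print(result.get("nModified"))
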
[13:51:27] <joannac> rosenrot87: erm, try again
[13:51:45] <joannac> that's your mongo shell's connection to your mongod dying and getting re-established
[13:52:24] <rosenrot87> joannac: http://pastie.org/9993635
[13:53:19] <GothAlice> rosenrot87: Does the bare name "phoebe" DNS resolve?
[13:53:33] <rosenrot87> joannac: should the host be contain the ip address?
[13:53:52] <rosenrot87> GothAlice: I do not think so
[13:55:04] <GothAlice> rosenrot87: Make sure all DNS names, both fully qualified and bare, resolve, even if you have to add them to /etc/hosts (%SYSTEM32%/drivers/etc/hosts on Windows, AFIK).
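
For the record, the usual quick fix is a hosts entry on the client machine mapping the replica-set member name to the address from the connection string, something like (using the placeholder IP from above):

    # /etc/hosts, or C:\Windows\System32\drivers\etc\hosts on Windows
    111.111.111.111    phoebe
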
[13:55:16] <amitprakash> GothAlice, would it be advisable to go with bulk updates as opposed to updating one by one?
[13:55:32] <amitprakash> GothAlice, we'd be using unorderedbulkOps
[13:55:40] <rosenrot87> joannac: Could it be the problem that the host and primary entry "phoebe:27017" can not be resolved?
[13:56:21] <rosenrot87> GothAlice: Could I replace the "phoebe" with the plain IP address of the server?
[13:56:35] <rosenrot87> GothAlice: If so, how?
[13:56:55] <GothAlice> amitprakash: It can be worth it, but to truly answer the question would require writing it both ways and comparing performance.
[13:57:14] <rosenrot87> GothAlice: just tried ping phoebe on the server, it works
[13:57:16] <GothAlice> rosenrot87: Are those IPs going to ever change?
[13:57:19] <amitprakash> GothAlice, we've already done the one by one approach and its already taking a lot of time
[13:57:42] <rosenrot87> GothAlice: NO it is a fixed IP from a rented server
[13:58:03] <GothAlice> amitprakash: Perfect, then you have a point of comparison. Optimization without measurement is by definition premature. :)
[13:58:38] <amitprakash> GothAlice, aight :) thanks
[13:59:41] <joannac> rosenrot87: can you make the hostname "phoebe" resolve from the machine where you're testing the c# driver and try again?
[14:00:37] <rosenrot87> joannac: no
[14:01:26] <GothAlice> I'd refer to that as "gee gee".
[14:03:52] <rosenrot87> I try
[14:05:10] <rosenrot87> That was the solution!!!!!
[14:05:18] <rosenrot87> It works now
[14:05:36] <rosenrot87> Ok it gives me a query error right now.....one seconf
[14:07:13] <rosenrot87> joannac: works :)
[14:08:04] <rosenrot87> Thank you very much! I'm happy right now
[14:08:45] <rosenrot87> GothAlice: How to replace the phoebe entries with the actual IP address?
[14:09:33] <GothAlice> Two points: first, I don't know. Second, it's not something I'd ever do. DNS allows for reconfiguration, potentially without downtime or reconfiguration steps (other than adjusting DNS resolution), so this is a great feature, not a bug.
[14:10:52] <arussel> should I expect my driver (reactivemongo) to use a snapshot when doing a query ?
[14:13:22] <GothAlice> arussel: See the "snapshot" method reference on http://reactivemongo.org/releases/0.10/api/index.html#reactivemongo.api.collections.GenericQueryBuilder
[14:13:36] <GothAlice> arussel: Out of the box, likely not. No driver I know of snapshots by default.
[14:14:59] <GothAlice> (A QueryBuilder instance is what is returned by collection.find().)
[14:23:34] <arussel> GothAlice: thanks, not easy to google snapshot as we get all the results about version SNAPSHOT
[14:28:24] <menger> hey guys, how do i install mongodb-2.2.4 via yum?
[14:29:00] <Derick> you really don't want to install an old version like that
[14:29:03] <GothAlice> :|
[14:29:09] <Derick> and I don't think we have those packages around anymore
[14:29:47] <menger> i have to (client says so meh) use that version
[14:30:05] <menger> and its in the repo: http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/RPMS/
[14:30:09] <GothAlice> Clients can be wrong.
[14:30:35] <Derick> at least use 2.2.7
[14:30:36] <GothAlice> The trick, I find, is in the balance between evaluating what the client wants, and giving them what they need.
[14:30:42] <GothAlice> http://docs.mongodb.org/v2.2/tutorial/install-mongodb-on-red-hat-centos-or-fedora-linux/ < these are the docs from 2.2, no guarantee it'll work.
[14:30:47] <Derick> but, really, you want the latest 2.6
[14:30:53] <GothAlice> Indeed.
[14:31:00] <GothAlice> The suggestion level on that is over nine thousand.
[14:31:17] <menger> hahaha thank you guys for your quick help :)
[14:31:25] <menger> i hope i can convince my client :D
[14:32:13] <Derick> good luck
[14:32:51] <menger> whats the reason i should at least use 2.2.7?
[14:32:58] <GothAlice> Bug fixes.
[14:33:38] <menger> but is it still compatible?
[14:33:50] <GothAlice> Very much so.
[14:34:12] <menger> got that thank you :3
[14:34:46] <GothAlice> menger: https://jira.mongodb.org/browse/SERVER/fixforversion/12313 < browse forward from here (navigation in upper right) to see what has improved over time up until 2.2.7.
[14:35:27] <GothAlice> ("View issues" at the bottom for the full list for any release.)
[14:36:00] <GothAlice> (Or "Release Notes" button in top right for a breakdown by category for any release.)
[15:05:39] <Cygn> Anyone ever used the jenssegers laravel plugin for mongodb and figured out how to use map/reduce there?
[15:37:55] <calmbird> hi, could you help me: http://gyazo.com/a6d611265928b046c91f41edacb81877 , mongoose: how could I deal with category = All in my case?
[15:38:31] <calmbird> without changing to much given code
[15:39:14] <GothAlice> calmbird: Progressive query refinement. Basically, store the intermediate result of Idea.where('author').equals(author) somewhere, then only conditionally restrict on category through re-assignment of the query variable you are building up.
[15:39:44] <GothAlice> Wish that was a text gist instead of an image…
[15:41:09] <GothAlice> calmbird: https://gist.github.com/amcgregor/45bdebdd2bdf1c46017c
[15:41:32] <calmbird> i see
[15:41:52] <calmbird> yeah thats great
[15:41:55] <calmbird> hank you very much!
[15:41:57] <GothAlice> :)
[15:41:59] <calmbird> thank
[15:43:07] <GothAlice> Insert obligatory rallying cry of "MONGOOSE!" and fist, raised to the sky. That is some ugly, ugly stateful syntax. ^_^;
[15:43:17] <calmbird> so simple and working, forgot I can make reference
[15:44:05] <calmbird> :D
[15:45:23] <GothAlice> Collection.where('field').equals(value) — How is this "easier" than Collection.find({field: value})? What the hojek does .where() even return? Can you save _that_ for later? (I wouldn't try it…) I love libraries that give more questions than answers. XD
[15:46:19] <calmbird> GothAlice: I don't know actualy
[15:47:11] <calmbird> Just listening to mongouniversity, but starting to hate mongoose lately
[15:47:33] <GothAlice> The majority of support cases I deal with in here are mongoose-related injuries.
[15:47:34] <d0x> To make our "operational" data (collected over the day) available in a dashboard, I used a daily MongoDB MR job that converts it (joining collections). I used MR because i don't like to have another "infrastructure" to do this job. (in sum we have around 400GB). Is it common to use MR for this kind of task? Because I have already hit limitations like calling `db.xxx.find(this.xxx)` in the map stage. As a workaround i simply loaded the
[15:47:34] <d0x> whole xxx database into the scope... Which is working because the joined collections aren't huge.
[15:48:46] <calmbird> Oh I can't find clear answer in the internet. Can we start mongod with some parameter, that will ensure data will be writen to disk, before answering?
[15:49:23] <calmbird> Because in standard, mongodb is saying ok, and then writing data later right?
[15:49:33] <GothAlice> d0x: We took a different approach for our event analytics. We pre-aggregate the relevant statistics, allowing us to produce whole dashboards a la http://cl.ly/image/2W0a2D3I370F that generate, from live data, in less than 100ms.
[15:49:49] <calmbird> prety bad way for important data
[15:49:54] <GothAlice> d0x: See: http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework
[15:50:43] <GothAlice> calmbird: There are many different options you can choose when telling MongoDB to read and write data. What you're looking for is the ability to set a default "write concern" for your application connections.
[15:50:55] <GothAlice> calmbird: See: http://docs.mongodb.org/manual/core/write-concern/
[15:51:40] <calmbird> ok thanx
[15:52:00] <calmbird> I just want to be sure, that mongodb will write data to disk before answering ok.
[15:52:45] <GothAlice> calmbird: By default these days, MongoDB will wait for confirmation of receipt by the server. If you want disk-persistence guarantees, enable "journalling" in your write concern. If you want to avoid a single-host failure nuking the data, you can specify how many hosts you want the data to appear on before the write is considered complete.
[15:54:50] <calmbird> I've seen a video https://www.youtube.com/watch?v=JWaDa8taiIQ, that guy is saying mongodb is bad for critical data, because you are not sure data will be written to disk, but mongo will always answer yes. But I'm guessing that has been fixed by now.
[15:55:01] <GothAlice> That's been fixed for a while.
[15:55:10] <GothAlice> It's now just a common misconception.
[15:55:35] <GothAlice> calmbird: As a fun note, I actually _lower_ the write concern for many of my inserts. Centrally aggregated logging records need to be fast, it's OK if a few get lost during high load. (There are machine local files which contain any missing records.)
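
Concretely, in PyMongo the write concern can be set per connection or per operation; a sketch with illustrative values (2.x-era insert helper assumed): w=1 waits for the primary's acknowledgement, j=True also waits for the journal to hit disk, w="majority" waits for a majority of replica-set members, and w=0 is the fire-and-forget case GothAlice mentions for logging.

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017/", w=1, j=True)
    db = client.mydb

    # Acknowledged and journalled before this call returns.
    db.payments.insert({"amount": 10000, "currency": "EUR"})

    # Per-operation override: unacknowledged, fastest, may be lost under load.
    db.logs.insert({"msg": "hello"}, w=0)
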
[15:56:43] <krisfremen> funny how people believe everything they read on the internet without doing any testing for themselves
[15:57:37] <calmbird> krisfremen: Lack of time, and lazynes
[15:58:43] <calmbird> I can just use a mature database like mysql, postgres etc, that everyone is using. Or get a lucky shot with mongodb, and then cry that a 10k Euro transfer wasn't written in the DB. :)
[15:59:02] <GothAlice> calmbird: I have 26 TiB of data in MongoDB, and I've had it in MongoDB for 8 years…
[15:59:14] <calmbird> wow
[15:59:17] <calmbird> :P
[15:59:31] <Derick> GothAlice: you started at which version? 0.8, 1.0 ?
[15:59:43] <jeho3> for 8 years?
[15:59:56] <krisfremen> mongodb has been around for 8 years already? damn, time flies
[16:00:03] <jeho3> mongoldb just crawled out of the vagina of Eliot
[16:00:08] <krisfremen> lol
[16:00:11] <Derick> jeho3: that's inappropriate
[16:00:18] <jeho3> but right
[16:00:20] <GothAlice> …
[16:00:54] <jeho3> MongoDB is software from pussies for pussies
[16:00:56] <Derick> GothAlice: wikipedia says our first release was in 2009 though :)
[16:01:22] <GothAlice> Okay, 6. My "nearest even number" rounding failed in this instance. calmbird: I migrated my data out of MySQL as soon as a viable alternative was presented. MongoDB fit the bill for my requirements, even back then.
[16:01:27] <Derick> GothAlice: :-)
[16:01:45] <jeho4> hey, derick, inbred scum
[16:01:52] <krisfremen> still, 6 years..
[16:01:56] <jeho4> mongodb is for assholes
[16:01:56] <GothAlice> jeho4: Oh, it's you again.
[16:02:03] <calmbird> GothAlice: : I love mongodb,all I was worying about was data safety
[16:02:07] <GothAlice> Though I did have to add full text indexing and compression at the application layer.
[16:02:09] <jeho4> ah, u use docker and mongoldb
[16:02:22] <Derick> i can play this game for a while...
[16:02:45] <GothAlice> calmbird: If you've ever had a failure in MySQL that required you to reverse engineer the on-disk InnoDB table structure… BSON is a joy to work with by comparison.
[16:03:01] <Derick> GothAlice: yeah - at least one document stays together
[16:03:55] <GothAlice> Derick: It was a bit of a shock that even with directoryPerDB enabled, InnoDB stored critical table structural information in a common pool a directory up… quite the surprise indeed.
[16:04:18] <calmbird> GothAlice: : In mongodb we need to join data on the server side, in mysql we can join data on the database side, is that an issue with mongodb? is it slower etc?
[16:04:47] <GothAlice> calmbird: In MongoDB you don't design your data models as if they were relational, so in general the difference is a non-issue.
[16:05:12] <GothAlice> There are other ways of storing related data that make more sense when you have document storage, for example, I store all of the replies to a forum thread within that thread's document.
[16:05:39] <GothAlice> Need to move an entire thread to a different forum? $set one value in one document and you're done.
[16:05:42] <medmr> calmbird: if you have highly relational data
[16:05:49] <medmr> you may want to step back and reconsider mongodb
[16:05:51] <calmbird> GothAlice: Well in mongouniversity course, they said that for larger data, we should add more collections. So then we need to join some data.
[16:05:53] <medmr> as a data store
[16:05:59] <Derick> GothAlice: what sort of "tricks" did you do to make sure you're not hitting the 16MB doc limit that way? Or, don't you have that many replies?
[16:06:16] <GothAlice> Indeed. Highly relational data, data where you have transactional requirements, or data where you are performing deep graph traversal are all examples where MongoDB might not be the right solution for you.
[16:06:31] <calmbird> GothAlice: I see
[16:07:01] <medmr> but doing multiple calls to get related data
[16:07:06] <medmr> makes sense sometimes with mongo
[16:07:14] <GothAlice> Derick: 16MB doc limit = (16*1024*1024/6.5) ~2.5 million words if the average length is 5.5 and counting spaces.
[16:07:26] <Derick> GothAlice: yes - I know it's quite a lot :-)
[16:07:34] <GothAlice> Derick: But, in general, it's trivial to have a "comment" at the end that links to the continuation thread, then the initial thread is locked.
[16:07:39] <medmr> especially if you are looking up the related data by _id, thats pretty fast
[16:07:50] <Derick> GothAlice: okay - just curious what you did there.
[16:07:50] <Cygn> Hey Guys just a general question, if i have a collection, let's say with customers, each customer made multiple sales, would you say this is something that should like ALWAYS be done with a relational database? Or could somebody for example say i insert the sales as a sub-array inside a customer collection without going against any best practice rule?
[16:08:21] <Derick> Cygn: how does your app want to make use of the data?
[16:08:37] <medmr> sales can be an array
[16:09:00] <calmbird> look: http://gyazo.com/9094325c7d11126a3cb65e261200a670, this is from mongouniversity.com course, they are making related data here
[16:09:26] <medmr> there are different approaches each with pros and cons
[16:09:36] <calmbird> I will need to join this data
[16:09:37] <medmr> best choice depends on how you intend to use the data
[16:09:38] <Derick> medmr: I don't think I would have separated that out.
[16:10:12] <Cygn> Derick: I need to fetch sales, by ONLY counting them, using their date and market (would be saved in the subarray). Before i do that, they would be filtered using attributes of the customer. (For Example, i could need all sales of an iPad (article name in the subarray), but only if the customer is from USA(country in the customer data) )
[16:10:13] <calmbird> yeah we have something like 15MB limit, cant remember, if we expect documents to have large amount of data, we should split it to more collections and eventualy join it
[16:10:52] <calmbird> *limit per document
[16:10:56] <Derick> Cygn: i would not create subdocuments in that case then. Have each document store date, market, customer attributes that you need to query on - perhaps
[16:11:02] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html
[16:11:17] <Derick> calmbird: yes - but 16MB is a lot
[16:11:34] <calmbird> Derick: : I understand, but subdocuments are part of the document right? and if we have really a lot of subdocuments, we can exceed the ~15MB limit
[16:11:42] <calmbird> yeah its alot ^^
[16:11:52] <Derick> in the case of blog comments f.e. - if it's something like my tech blog, it's not a problem as I rarely have any comments at all. If you're hackernews, you might run into issues.
[16:11:54] <calmbird> but never say to much, or you can be surprised :D
[16:12:03] <Derick> calmbird: no, 16MB is per top level document
[16:12:04] <Cygn> Derick: But that would mean i should rather use a relational database? Because i was very happy about the speed that mongodb delivers right now, but that was BEFORE i had to filter it by the other filter criterias.
[16:12:15] <Derick> subdocuments, are part of the full document and count towards the 16MB
[16:12:25] <medmr> cygn
[16:12:37] <Derick> Cygn: mongo also handles it just fine - and potentially faster?
[16:12:46] <medmr> is there any way you can fetch related documents from another collection with out first looking up the salesperson
[16:12:55] <Derick> calmbird: you can't increase the 16MB limit.
[16:13:06] <medmr> i.e. look up the salesperson and the sales documents in parallel?
[16:13:13] <medmr> if you have a key that they all include
[16:13:20] <Derick> medmr: store the sales person with each sale, and you don't need to
[16:13:27] <calmbird> i have a game, document user with subdocuments, mapobjects, mapblockers, friends everything, and I started to worry about 16MB limit actualy
[16:13:33] <medmr> no just an id
[16:13:40] <GothAlice> calmbird: Consider that I when I was writing those forums I had to import all of the old forums for a gaming group with 14,000 people. The entire old forums could have fit in a single 16MB document.
[16:13:45] <medmr> no sense storing the whole doc
[16:13:45] <GothAlice> 16MB is a _lot_.
[16:13:48] <Derick> medmr: then you need to do two queries...
[16:13:53] <Cygn> Derick: Actually right now i don't see how to get this without joins… medmr: in every sale there is an attribute which declares the salesperson
[16:14:00] <medmr> GothAlice: you can't just say its a lot and not address the limit
[16:14:17] <GothAlice> medmr: I previously described, for the forums example, a method of having continuations.
[16:14:22] <medmr> yeah
[16:14:31] <calmbird> GothAlice: I see
[16:14:45] <medmr> Derick: two queries is okay, its like doing a join
[16:15:02] <Derick> medmr: yeah, it's like doing a join client side (ie, on the web server)
[16:15:02] <calmbird> Well in subdocument operations, we still miss some functionality in mongodb.
[16:15:06] <Derick> nothing really wrong with that
[16:15:26] <Cygn> medmr: So you would just fetch all sales that fit and all customers that fits and then iterate through it and see which fits together?
[16:15:33] <medmr> if the queries can be done in parallel its not significantly slower than looking up one monolith document
[16:15:53] <medmr> if i had a /salesperson/00001/ page
[16:16:07] <medmr> i would fetch the salesperson using id 00001
[16:16:11] <medmr> and the sales using id 00001
[16:16:13] <medmr> in parallel
[16:16:34] <calmbird> something like db.something.update({$set: {array1.variableA: 'someValue'} }
[16:16:36] <medmr> with an index on sales for .salespersonid
[16:17:09] <Cygn> The Problem is this could result in quite a high amount of data… let's say a few million sales and a few million customers, and i would have to fetch nearly 90% just to sort it out… does not feel like the most clever way :/
[16:17:17] <calmbird> it won't set variableA on all objects in the subdocument, only on the first
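
What calmbird is describing is the positional "$" operator, which only ever targets the first array element matched by the query; a hedged sketch with invented collection and field names. On the server versions discussed here, updating every element means one update per position, or restructuring the array into its own collection.

    from pymongo import MongoClient

    db = MongoClient().mydb

    # Sets variableA on the FIRST element of array1 whose flag is true;
    # other matching elements in the same document are left alone.
    db.things.update(
        {"array1.flag": True},
        {"$set": {"array1.$.variableA": "someValue"}},
    )
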
[16:17:27] <medmr> no you wouldnt have to fetch a lot to sort it out
[16:17:35] <medmr> you can adjust your query on sales collection as needed
[16:17:38] <medmr> to limit or sort or w/e
[16:18:13] <Cygn> medmr: So you would first fetch all customers that are fitting and afterwards ask for all sales that belong to them ?
[16:18:47] <Cygn> medmr: on client/server-side not database, that's what i mean
[16:19:11] <medmr> what is the criteria that you are using to lookup customers
[16:19:12] <Derick> I don't think it's worth doing it in parallel.
[16:19:32] <medmr> it works very well for some situations
[16:19:58] <Cygn> medmr: Every sale has a customer id
[16:20:33] <medmr> and what data are you trying to display from what key?
[16:21:07] <Derick> got to go now - please ping me (with a /msg) if this idiot with insults returns
[16:21:22] <GothAlice> Derick: Will do. He's almost becoming a regular! XD
[16:21:40] <medmr> you're not displaying a few million customers at once
[16:21:42] <medmr> right
[16:22:06] <Cygn> medmr: okay, most of the time, the question will f.e. be "how many sales in june" - i don't need the customer for that, only the sales (data is in the sales). But then there will be questions like "how many sales in june by vip customers" - that's where i will have to start filtering the results by the customer criterias
[16:22:32] <GothAlice> "VIP customers" is, however, a small subset.
[16:22:46] <GothAlice> Thus a client_id: {$in: […]} isn't too egregious.
[16:22:48] <Cygn> medmr: Actually i have to in a way (because i always have to count sales, filtered by customer criteria) but no i don't have to display the customer information by itself.
[16:24:03] <Cygn> GothAlice: That was just one example without digging to deep in the actual topic. The Customers we are talking about here have very much specific criteria (because actually it is not a customer buying but a sales unit, which contains information about series, headunit, country, model etc.)
[16:24:04] <medmr> you could either include flags that make sales searchable by those criteria
[16:24:43] <medmr> i.e. sales { custid: 23459087234587, vipcust: true, blah: blah}
[16:24:49] <medmr> or do the customer lookups first
[16:25:05] <medmr> and get sales custid: {$in: []}
[16:25:16] <GothAlice> Cygn: The trick with MongoDB is that your "schema" isn't sacrosanct, it should instead adapt to meet your query needs. Pre-aggregation (i.e. including "vip" in the sale) isn't data duplication, it's making your data queryable.
[16:26:06] <Cygn> GothAlice: Yeah i'm kind of getting this idea right now, but that would mean i would have to update every single sale of the customer when the customer changes, right?
[16:26:31] <GothAlice> Potentially, but again, an index on cust_id will make that type of update fast.
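
Putting medmr's and GothAlice's suggestions together, a sketch of the "attributes copied onto the sale" layout and the two operations it implies (field names and values are invented; PyMongo 2.x helpers assumed):

    import datetime
    from pymongo import MongoClient

    db = MongoClient().mydb

    # Each sale carries the customer attributes it will be filtered on.
    db.sales.insert({
        "customer_id": 1,
        "vip": True,
        "country": "USA",
        "article": "iPad",
        "date": datetime.datetime(2015, 6, 12),
    })

    # "How many iPad sales in June by VIP customers" becomes one count:
    june = {"$gte": datetime.datetime(2015, 6, 1), "$lt": datetime.datetime(2015, 7, 1)}
    print(db.sales.find({"vip": True, "article": "iPad", "date": june}).count())

    # When a customer's status changes, one multi-update (backed by an
    # index on customer_id) refreshes their sales.
    db.sales.ensure_index("customer_id")
    db.sales.update({"customer_id": 1}, {"$set": {"vip": False}}, multi=True)
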
[16:26:38] <medmr> this is the problem you get into sometimes
[16:26:42] <Cygn> medmr: by flags you mean just insert the customer attributes into the data, because i just googled "mongodb flags" ;)
[16:26:50] <medmr> yes Cygn
[16:26:56] <medmr> sorry for my use of "reserved word"
[16:26:58] <medmr> :)
[16:27:26] <medmr> just meant in generic sense
[16:27:35] <medmr> boolean attribute
[16:28:43] <Cygn> GothAlice, medmr: Just one last follow-up question. If i would like to insert the data, my structure should be Customer1 { id: 1, sales: { sale1, sale2, sale3 }}, Customer2 { id: 2, sales: { sale1, sale2, sale3 }}, and an index on the customer id, correct?
[16:29:32] <GothAlice> I personally wouldn't store sales embedded under customers.
[16:29:46] <GothAlice> In fact, at work, invoices are their own collection.
[16:30:00] <cheeser> it's the only reasonable choice
[16:30:31] <Cygn> GothAlice: Okay, but (sorry for asking again) then i don't get how i should filter the sales using the customer data, without fetching the customer first in my client/server-side code?
[16:30:44] <GothAlice> Indeed. We also pre-aggregate the related data and embed it all, since our invoices need to be involatile.
[16:31:13] <Cygn> GothAlice: But by pre-aggregation you mean aggregate on database-side or on server-side?
[16:31:34] <GothAlice> Cygn: Pre-aggregation means doing it at the time you insert the record.
[16:31:39] <GothAlice> Cygn: Question: if a customer changes their billing address, do you want old, archived orders to change to reflect the new address?
[16:32:04] <GothAlice> (The only correct answer is: no. Archived financial information should not change.)
[16:33:06] <GothAlice> As such, we clone pretty much every detail about the user placing an order, and the company the order is being billed to, within our invoices. (All that make sense to store, at least. Login history from the user doesn't need to be there. ;)
[16:33:36] <GothAlice> Querying our invoices for company or user-related information is then easy; no join, since the data is "pre-joined".
[16:34:18] <Cygn> GothAlice: Sure, actually in our case we are storing the sales only for statistic use. But anyway, if i get you right, you would aggregate the two collections in the moment you would query for it, like creating a View in the classic sql world?
[16:34:56] <GothAlice> Pretty much the exact opposite of what you just described.
[16:36:12] <Cygn> GothAlice: You would insert all the customer data in the sale just when you save the sale? But didn't you say invoices are their own collections? Sorry, i really try to get it.
[16:36:42] <GothAlice> User = {name: "Bob Dole", email: "bdole@whitehouse.gov"} Company = {name: "Some Company", vip: true} Invoice = {creator: {name: "Bob Dole", email: …}, company: {name: "Some Company", vip: true, …}, line_items: […], …}
[16:37:08] <GothAlice> Indeed. If a user changes their e-mail address, or a company's VIP status changes, in our case we still want those archived orders to say "yeah, this order was a VIP order".
[16:37:48] <GothAlice> Real joins, or fake joins, would actually give us the wrong answers. :)
[16:38:21] <Cygn> GothAlice: I won't have this situation (since our "customers" are sales units for statistical data, if they change their status, it should be changed for older sales also), but anyway i could think of doing it that way anyway.
[16:39:22] <Cygn> Now i just have to find out how to fetch for subarrays of an entity using a condition for only specific subarrays ;)
[16:39:35] <mikeputnam> can someone share or point me to pragmatic methods to handle data migrations/deployments as part of a workflow? devs create code + data/schema changes => i orchestrate those changes into deployments that get applied to staging, production
[16:40:26] <mikeputnam> in the mysql world, i've done this with capistrano and incrementally numbered .sql files that get applied sequentially.
[16:40:51] <StephenLynx> with mongo you don't have a schema.
[16:41:00] <StephenLynx> but
[16:41:23] <mikeputnam> i realize this. but that is a semantic argument. the concept of a schema still exists.
[16:41:40] <StephenLynx> one can write code that perform queries that rename fields and adapt data to a different logic.
[16:41:53] <Cygn> GothAlice, medmr: thank you VERY much !
[16:42:43] <StephenLynx> when I am writing code I always assume things may not be there and write defensively around that.
[16:42:58] <StephenLynx> so if something changes it won't break anything
[16:44:09] <StephenLynx> so unless you are working with a tool that implements a pseudo-schema, like mongoose, the developer will have to develop it from scratch.
[16:46:56] <mikeputnam> i see. that makes sense. from my perspective, i will likely be adding a "db_changes" step to my deployment process that just runs whatever the devs come up with.
[16:47:19] <mikeputnam> StephenLynx: thank you
[16:47:35] <StephenLynx> np
[17:04:16] <Cygn> StephenLynx: Do you have an url with a best practice example for how to query for items using attributes of the subarrays for me?
[17:04:35] <StephenLynx> nope.
[17:05:01] <StephenLynx> but I can link you a project I did using mongo and try to learn the pattern I work under.
[17:05:21] <Cygn> StephenLynx: Sure, why not :) Thanks !
[17:07:14] <StephenLynx> I try to stick to basic rules: don't make multiple queries for one thing, don't avoid sub arrays too much
[17:07:34] <StephenLynx> https://gitlab.com/mrseth/bck_lynxhub Cygn
[17:08:02] <GothAlice> mikeputnam: https://gist.github.com/amcgregor/dda39062f8b4f74f5e1a < my method of doing "migrations"
[17:08:03] <StephenLynx> https://gitlab.com/mrseth/bck_lynxhub/blob/master/doc/model.txt my schema
[17:08:19] <Cygn> StephenLynx: Just a general question before i start digging in: Do you use elemMatch for the Subarrays?
[17:08:30] <StephenLynx> often.
[17:08:51] <StephenLynx> specially when I just need to know if the array contains a specific element
[17:08:55] <StephenLynx> so I don't have to iterate over it.
[17:09:15] <StephenLynx> but not always, though. sometimes I want the full array.
[17:09:32] <GothAlice> mikeputnam: The runner searches a given Python package namespace for scripts with main() functions (sorta, it's a bit indiscriminate at the moment) and runs them, using docstrings to output progress. The migrations themselves are written so that multiple execution is safe, i.e. they check preconditions.
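
A minimal sketch of that kind of runner, assuming a hypothetical "migrations" package with one module per migration, each exposing an idempotent main() and a docstring used for progress output:

    import importlib
    import pkgutil

    import migrations  # hypothetical package holding the migration scripts


    def run_migrations(db):
        """Discover every module under the migrations package and run its main()."""
        for _, name, _ in pkgutil.iter_modules(migrations.__path__):
            module = importlib.import_module("migrations." + name)
            main = getattr(module, "main", None)
            if main is None:
                continue
            print(module.__doc__ or name)  # docstring doubles as progress output
            main(db)  # each main() checks its own preconditions, so re-runs are safe
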
[17:13:47] <Cygn> StephenLynx: in my case it could be really useful since i never need the full data, i only need to count the subarrays using one of their criteria
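
For that "filter by a sub-array attribute" case, $elemMatch matches parent documents containing at least one array element that satisfies all the given conditions at once; a short sketch with invented names (counting the matching elements themselves, rather than the parent documents, needs the aggregation approach that comes up later in the log):

    import datetime
    from pymongo import MongoClient

    db = MongoClient().mydb

    # Customers with at least one iPad sale in June 2015.
    query = {"sales": {"$elemMatch": {
        "article": "iPad",
        "date": {"$gte": datetime.datetime(2015, 6, 1),
                 "$lt": datetime.datetime(2015, 7, 1)},
    }}}
    print(db.customers.find(query).count())
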
[17:18:39] <mikeputnam> GothAlice: thanks! this confirms the direction StephenLynx suggests. generic runner runs specific developer-created code that changes mongo appropriately.
[17:19:34] <mikeputnam> in the past i've come upon suggestions that documents should have a version number to enable rollbacks. any thoughts on this?
[17:20:28] <GothAlice> Well, that alone doesn't enable rollbacks, but consider: why does one really need to move backwards?
[17:22:04] <d0x> GothAlice: Thx for the references. Our application is using several collections which are designed for the daily business. And to "pre-aggregate" them for the Dashboard i utilised MR (because the Aggregation Framework had some limitations for us (like missing string functions, accessing data from other collections, ...)). After the MR is executed i use the aggregation framework as well to get fast responses.
[17:22:04] <d0x> How would you "pre-aggregate", "pre-join", ... the data? I thought the best is to utilise mongodb as well. And the only option for me was MapReduce. As an alternative i could do smth. like db.xxx.find().forEach(...). But that doesn't scale.
[17:23:14] <mikeputnam> my perspective is that of a system administrator. i'm concerned about late-at-night code+data migrations that fail due to unforeseen data conditions (or whatever) and i'm left holding the bag. as a non-developer with 15 developers generating change, this has me concerned about the stability of production.
[17:24:11] <mikeputnam> (and my own ability to recover from that sort of scenario)
[17:25:53] <mikeputnam> we are moving toward an always up/no downtime deployment model (blue/green) but this introduces more concerns for the developers writing the db changes correctly. -- at least in this model, i can recover by just reinstating the previous version.
[17:26:07] <GothAlice> mikeputnam: Standardize your deployment procedures. We have a random YouTube music video go up on the projector when the automation gets triggered, and the developer deploying must wear a silly hat.
[17:26:33] <GothAlice> (These two things generally attract attention, which means oversight of whatever that developer is doing in production.)
[17:27:11] <mikeputnam> lol
[17:27:12] <mikeputnam> nice
[17:28:07] <GothAlice> For example, we also deploy first to a clone of production called 'staging'. And only if everything goes well there do we repeat the automation on the real production environment.
[17:28:17] <mikeputnam> that much we do.
[17:28:49] <daidoji> hello everyone
[17:29:07] <daidoji> if inserting with continue_on_error=True, is there a way to get all the exceptions?
[17:29:17] <daidoji> or is it like the documentation says, I only get the last error that occurred in the batch?
[17:29:31] <GothAlice> I suspect the documentation would be correct in this instance.
[17:30:36] <daidoji> hmm, then thats kind of pointless for my use case then. Guess its better to use the old style bulk_import functions if I need all the errors?
[17:31:15] <GothAlice> Indeed. You'll get reduced throughput (inserts/sec) by querying for the result of each, but if that's what you need, that's what you need.
[17:33:38] <daidoji> hmm, okay
[17:34:50] <daidoji> also out of curiosity, what's the fastest way to load data in your experience? python -> pymongo or python parse -> stdout -> mongoimport
[17:35:32] <GothAlice> I've never really bothered to benchmark bulk loads. I do the smallest number of hops, so pymongo direct.
[17:35:34] <daidoji> I ask because we pipe-delimit here like all god-fearing people and mongoimport doesn't support choice of delimiters for some reason :-(
[17:35:38] <daidoji> ahhh
[17:35:51] <daidoji> is there a place we can vote on feature requests for mongoimport?
[17:36:08] <GothAlice> If you "delimit" at all, there's something wrong, 'cause MongoDB does deeply structured records, not flat ones. ;^)
[17:36:38] <daidoji> awww, I know that and if I were in charge things would be a lot different, but I jump when they say jump at the moment :)
[17:37:05] <daidoji> currently, I'm dealing with bulk-loading large datasets that change constantly, which is why I ask
[17:37:08] <GothAlice> Hint: YAML, of which JSON is a sub-set. (YAML also has a natural way to express multiple records per file.)
[17:37:52] <daidoji> unfortunately these formats are out of my control and are pipe delimited csvs :-9
[17:37:55] <daidoji> :-(
[17:37:59] <daidoji> but such is life
[17:38:05] <daidoji> anyways, thanks for the input
[19:03:41] <keeger> hello
[19:04:04] <keeger> i am trying to design a database for a program in mongo, and i have a question about document arrays
[19:04:19] <keeger> i would like to store an array of sub-documents sorted by timestamp inside a document
[19:05:01] <keeger> is there something in mongo to help with that, or do i have to code it myself
[19:07:11] <StephenLynx> AFAIK, it will keep the order you insert the elements in the array.
[19:07:33] <StephenLynx> but you can, from time to time, use an aggregate and update the array with the result of the aggregation.
[19:07:49] <StephenLynx> or you can unwind and sort on reading.
[19:08:05] <keeger> is unwind = sql select?
[19:08:10] <GothAlice> Well, no.
[19:08:18] <StephenLynx> the downside with this option, though, is that if the array is empty, you will get nothing, if I'm not mistaken.
[19:08:21] <GothAlice> It's a more… verbose… way to query your data vs. normal find().
[19:08:42] <GothAlice> keeger: For the most part attempting to keep elements within a nested array in any order other than insertion order (i.e. always append or always prepend) is difficult.
[19:08:51] <StephenLynx> unwind splits the array into a series of documents with an element of the array each.
[19:09:38] <keeger> can i do the equivalent of a select order by?
[19:09:48] <GothAlice> Of a sub-array, not normally, no.
[19:09:58] <StephenLynx> because sort will only work for documents.
[19:10:06] <StephenLynx> thats why unwind exists.
[19:10:17] <StephenLynx> so you can split the array, sort and regroup it after.
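A shell sketch of that unwind/sort/regroup pipeline ("things", "items" and "ts" are made-up names, and someDocId stands in for the target _id); note StephenLynx's caveat that a document with an empty array drops out of the result:

    // Sort one document's nested "items" array by timestamp at read time.
    db.things.aggregate([
        { $match: { _id: someDocId } },                           // someDocId: placeholder
        { $unwind: "$items" },                                    // one doc per array element
        { $sort: { "items.ts": 1 } },                             // sort the elements
        { $group: { _id: "$_id", items: { $push: "$items" } } }   // regroup into an array
    ]);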
[19:10:22] <keeger> i see
[19:10:24] <GothAlice> Aggregate queries have their own restrictions, though.
[19:10:24] <keeger> is that expensive?
[19:10:32] <StephenLynx> I heard it is.
[19:10:39] <StephenLynx> don't rely on me though.
[19:10:48] <keeger> might be faster for me to just grab the whole array and sort it in my app code
[19:10:56] <StephenLynx> I don't think so.
[19:11:08] <GothAlice> keeger: Regardless of how you do it (aggregate unwind, filter/sort, group, or within your application code) it's expensive.
[19:11:25] <GothAlice> In one way (aggregate) the data manipulation happens closer to the data set.
[19:12:02] <GothAlice> See also: http://docs.mongodb.org/manual/core/aggregation-pipeline-limits/
[19:13:49] <keeger> dont think i'll hit the size limits
[19:14:14] <keeger> i believe the array size will be < 100 elements for the most part
[19:14:23] <keeger> and the sub documents are small
[19:16:31] <GothAlice> In that instance, I'd just sort them at the display layer. I.e. in your application code.
[19:17:18] <keeger> sounds like it. then when i delete one, i can just delete it by index in the array
[19:18:11] <keeger> i'm also struggling a little conceptualizing horizontal scaling on webservers and mongo
[19:18:48] <keeger> if i have 2 web servers, and 2 people attempt to write to the same document, does it block the 2nd one?
[19:19:43] <GothAlice> keeger: It… can?
[19:19:54] <GothAlice> Depends on how you use it.
[19:20:53] <mordonez> hi guys
[19:20:57] <keeger> i am thinking of putting in a simple lock system, because i need to loop through this array and update other parts of the same document. but it could take 100 ms who knows
[19:20:58] <mordonez> I want to add a replica for my db
[19:21:20] <mordonez> I have to run rs.initiate() on the replica?
[19:21:43] <keeger> so i was thinking of putting a document.LockId entry, and then the clients would read, look for lock, if missing lock it with a unique Id, read it back and confirm the Id matches, and it's theirs
[19:22:03] <keeger> then when done processing, clear the lock
[19:22:08] <GothAlice> keeger: MongoDB doesn't ever really do two things to the same data at once. Instead, MongoDB reduces operations down to individual atomic operations. So operations A and B on the same data will either resolve A, B, or B, A.
[19:22:32] <GothAlice> mordonez: See: http://docs.mongodb.org/manual/tutorial/expand-replica-set/
[19:22:52] <GothAlice> mordonez: Or, if you don't already have a replica set, see: http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
[19:23:22] <mordonez> thanks GothAlice
[19:24:32] <GothAlice> keeger: Update operations (A and B in my continuing example) may have conditions that need to be true before the update can be applied. This would be the query that is passed as the first argument to .update() (normally). You can check for ownership of a lock and if nUpdated=0 (in the returned result) the condition failed.
[19:25:14] <keeger> GothAlice, ah, that would work nicely
[19:25:40] <keeger> i think that's more straightforward and lightweight than trying to do a network-level semaphore
[19:26:57] <GothAlice> keeger: As a note, so far I haven't ever required a dedicated field for locking except for my distributed task runner, which locks based on current worker. Simple field updates and update-if-not-modified (conditional) updates are more than sufficient for most cases.
[19:27:47] <keeger> i agree, for the majority of my app that should be fine
[19:28:14] <keeger> but i do have one process that i don't know how long it will run, and it can only run once. i'd like to lock it at the db level cuz of the horizontal ness
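A rough shell sketch of that db-level lock using the update-if-not-modified idea GothAlice described (the "jobs" collection, jobId and lockId are placeholders): the query half of the update is the lock check, and nModified tells you whether you won it.

    // Try to acquire the lock: only matches if no one holds it yet.
    var lockId = ObjectId();
    var res = db.jobs.update(
        { _id: jobId, lockId: null },
        { $set: { lockId: lockId, lockedAt: new Date() } }
    );
    if (res.nModified === 1) {
        // We own the lock; do the long-running work, then release it.
        db.jobs.update({ _id: jobId, lockId: lockId }, { $set: { lockId: null } });
    }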
[19:30:48] <MLM> How do I find documents that have a certain string in one of their array of string fields? Mongoose `Model.find({ usersIds: req.user._id })` isn't pulling up any results (confirmed that they are in the collection)
[19:35:20] <keeger> GothAlice, thx for the help
[19:55:25] <sivli> hi all, anyone on?
[19:55:35] <GothAlice> Nobody but us turkeys.
[19:55:57] <StephenLynx> :v
[19:56:11] <StephenLynx> i get the c&c reference, but not the turkey one
[19:56:13] <sivli> lol, just popping in to ask about topojson vs geojson.
[19:56:28] <StephenLynx> never heard about either
[19:57:29] <sivli> Well I am rather new to geo-location data but as I understand it mongo has good geojson support?
[19:59:28] <sivli> Was wondering if anyone uses the geospatial indexes and if they know if topojson is supported?
[19:59:57] <kali> nope
[20:00:07] <kali> only a subset of geojson is supported actually
[20:00:15] <sivli> Sads :(
[20:01:00] <sivli> We are not doing anything too crazy right now, just points, but there are plans to expand later and topo looks much more efficient.
[20:01:17] <xissburg> Hmm
[20:01:28] <kali> i'm not sure you'll find good support for it in any general purpose database, tbh
[20:01:32] <sivli> Any idea if this is being considered or is it just too early in the geo game to know?
[20:01:34] <kali> or any support at all
[20:01:50] <sivli> Thus one of the selling points to mongodb :)
[20:02:20] <sivli> (and the fact that meteor js locks us to it, but I am not complaining)
[20:02:22] <kali> well, the geojson it supports is useful
[20:02:40] <kali> and the implementation is reasonably efficient
[20:03:21] <sivli> Agreed. Ok well, thanks, I just wanted to be sure before I told the boss man no can do for topo. Not worth losing the mongodb support.
[20:19:23] <MLM> Running `db.queues.find({userIds: '54effc576676066c20293010'})` returns some results in the mongo shell. Running the same query with Mongoose `Queue.find({ usersIds: '54effc576676066c20293010' })` gives no results. What is the best way to debug this?
[20:20:06] <GothAlice> MLM: It's actually a critically important distinction: are you storing the hex-encoded string version of the ObjectIds, or are you storing actual ObjectIds?
[20:20:20] <MLM> String version _id
[20:21:13] <MLM> When it is created: `queue.userIds = [req.user._id];`
[20:21:15] <GothAlice> Less cool, but consistency is key. You can't mix the two and get sane results back. It's less cool because you're wasting 12 bytes for every single reference stored, but…
[20:21:22] <GothAlice> Ah, then that's an ObjectId.
[20:21:32] <GothAlice> So you're mixing, and your ability to query that field goes out the window. :(
[20:21:37] <MLM> Oh, I thought that was `id`
[20:21:48] <GothAlice> _id is the field name for the "automatic" ID.
[20:22:32] <GothAlice> MLM: http://cl.ly/image/2s1G3T0E0e22
[20:23:33] <MLM> I will convert over to full string representation consistency but why does it return results in the Mongo shell
[20:23:53] <MLM> (in my case at least)
[20:24:10] <GothAlice> That's a very good question. Likely the ODM-side of things recognizes that that field is an ObjectId reference and is attempting to automatically cast the (valid) string-version you are supplying.
[20:24:16] <GothAlice> Please: standardize on real ObjectIds.
[20:24:34] <GothAlice> Real ObjectIds store some interesting data that is useful to be able to access. And halving the storage space is nice.
[20:24:56] <GothAlice> See: http://docs.mongodb.org/manual/reference/object-id/
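The practical consequence, shown in the shell (illustration only): a stored ObjectId only matches an ObjectId in the query, not its hex string, so mixing the two silently loses results.

    db.queues.insert({ userIds: [ ObjectId("54effc576676066c20293010") ] });
    db.queues.find({ userIds: ObjectId("54effc576676066c20293010") });  // matches
    db.queues.find({ userIds: "54effc576676066c20293010" });            // no match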
[20:25:28] <MLM> Will do (convert to consistent ObjectId). Need to look up again the quirks with comparing them. Thanks for the help
[20:25:42] <GothAlice> They compare numerically based on time.
[20:25:50] <GothAlice> (Effectively. Time is the first field in the BLOB.)
[20:58:21] <MLM> How do I query for a field that is an ObjectId?
[20:58:22] <MLM> The mongo shell is returning the documents with this query `db.queues.find({userIds: ObjectId('54effc576676066c20293010')})` but I can't get Mongoose to return it. I tried converting the string to an ObjectId in the query itself even though I have read that it should be auto-cast.
[20:58:27] <MLM> `Queue.find({ usersIds: '54effc576676066c20293010' })` or `Queue.find({ usersIds: new require('mongoose').Types.ObjectId('54effc576676066c20293010') })`
[21:02:58] <obeardly> MLM: I'm a noob to Mongo, but couldn't you just: ObjectId("yourstringgoeshere").toString()
[21:03:16] <MLM> GothAlice :/ - I know we talked before about native Mongodb driver and I was tempted and tried moving over but I kinda like the enforced Schema system which I would need to find somewhere else or write custom
[21:04:24] <MLM> Maybe it is worth the leap because of these weird issues
[21:10:34] <MLM> Here is a full barebones test snippet that demonstrates the issue if anyone is interested. If it is not my fault then I'll make an issue: http://pastebin.com/enUPT03A
[21:15:47] <GothAlice> MLM: There's a schema, it obviously says "this is an array of ObjectIds", doesn't cast when querying. Looks like a bug, or, a feature, depending on how generous the Mongoose developers want to be.
[21:16:29] <mspro> hello, i think i managed to create a new collection with the name [object Object] when i used the copyTo command, now i can’t drop that collection, anyone have an idea?
[21:17:20] <calmbird> hi, can mongodb somehow deal with transactions/object locking?
[21:17:35] <GothAlice> mspro: In the MongoDB shell you can access collections as attributes of the database (db.foo) but also as associative array elements (db['foo']) — this latter approach should let you clean up that mis-named collection.
[21:17:55] <GothAlice> mspro: Also love the combination of your IRC handle and having that particular issue, BTW.
[21:18:00] <MLM> calmbird: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/
[21:18:08] <calmbird> thanx
[21:18:51] <GothAlice> calmbird: That's the typical approach. Lighter-weight approaches involve "update-if-not-different" mechanisms; two-phase is most useful for coordinating changes to multiple documents, on single documents the simpler approaches will likely be better.
[21:19:07] <MLM> GothAlice: Not totally understanding what you are saying. Is it just the way I am trying to query it or a true bug?
[21:20:16] <GothAlice> MLM: Unfortunately you'd have to ask the Mongoose developers. Whenever Mongoose comes up I'm reminded of a quote which I will modify appropriately: I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a design. I.e. they may consider the behaviour you are running into to be by-design.
[21:20:36] <mspro> thanks GothAlice i will look in to that
[21:22:41] <MLM> Made an issue: https://github.com/LearnBoost/mongoose/issues/2728 - If you see any way to improve the clarity/description of the issue, then I'm interested
[21:23:33] <GothAlice> MLM: The test case looks thorough and complete for the issue.
[21:28:15] <GothAlice> obeardly: One stores ObjectIds as ObjectIds to save 17 bytes of space (hex encoding doubles the number of raw bytes, plus requires a terminating CString null and leading 4-byte PString length), to maintain numeric comparison capability (i.e. you can query $gt/$lt ranges to filter based on record creation time), as well as maintain the client-side capability to examine the fields of the ObjectId (timestamp, host, PID, sequence number) without
[21:28:16] <GothAlice> requiring additional manual casting.
[21:28:30] <GothAlice> I love it when people disappear mid-typing. XD
[21:30:24] <mspro> GothAlice: i can access my other collections with the [‘foo’] syntax, but when i do a ‘show collections’ i get a list with my collections + on top of that list within square brackets i get the [object Object] that wasn’t there before? It’s like when you want to print an object
[21:30:46] <GothAlice> mspro: db['[object Object]']
[21:32:51] <GothAlice> mspro: http://cl.ly/image/1i2E0c0b0Q3r
[21:33:06] <mspro> GothAlice: you're a genius and i still have a lot to learn, so my collection was literally named [object Object], i did not expect that one :) thanks
[21:33:13] <GothAlice> ^_^
[21:33:49] <GothAlice> mspro: MongoDB would "stringify" any value you try to use as a collection name. "show collections" is less useful to see what's going on than "db.getCollectionNames()" is in this instance.
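For the record, the cleanup GothAlice described looks like this in the shell:

    db.getCollectionNames();        // the literal "[object Object]" shows up in the list
    db["[object Object]"].drop();   // bracket notation reaches the badly-named collection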
[21:35:21] <keeger> so if I have a natural key for my document, would it be better to use that over the ObjectId?
[21:35:36] <GothAlice> keeger: If it's unique and can be generated without querying the existing dataset, yes.
[21:35:37] <mspro> yes, i tried that second one but still didn’t get that it was listing the literal name
[21:36:25] <keeger> GothAlice, when you say generated without querying, you mean, it's not like Id = Max(id) + 1?
[21:36:53] <GothAlice> keeger: Indeed. Or, say, something more complex like YYYYWWNNN where NNN is the invoice number for the week WW in year YYYY. (Our invoicing scheme at work.)
[21:37:16] <GothAlice> Any time you need existing data to insert a new record you run into race conditions.
[21:37:24] <keeger> yep
[21:37:46] <GothAlice> (This is why MongoDB uses ObjectId and not auto-increment. ObjectId scales to multiple independent processes quite well.)
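For example (a sketch; the YYYYWWNNN invoice scheme and the field names are only illustrative), a natural key simply replaces the generated ObjectId in _id:

    db.invoices.insert({ _id: "201509003", customer: "ACME", total: 1250 });
    db.invoices.find({ _id: "201509003" });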
[21:38:35] <keeger> yeah, i'm just trying to figure out how i want to handle a data structure
[21:38:56] <keeger> i have a fixed length set of locations
[21:39:06] <GothAlice> (This is something Facebook and Twitter had to learn the hard way… Twitter even rolled an entire separately-scaled software service for the purpose of generating IDs without conflict. ;^)
[21:39:29] <keeger> and 1 person can only "own" a location at a time
[21:39:49] <keeger> i dont want to have to do joins, so i'm thinking the location is inside the person document
[21:40:02] <keeger> person: locations []
[21:40:13] <keeger> and if it changes, i copy the location sub document to the new owner's place
[21:40:25] <GothAlice> Setting ownership in a way that there can be only one victor (first person to claim wins) is easy using update-if-not-modified. db.locations.update({_id: …, owner: null}, {$set: {owner: ObjectId(…)}})
[21:40:37] <keeger> it's not even a race condition
[21:40:55] <keeger> more of a do i store a collection of Locations, and a person holds a reference to it?
[21:40:57] <GothAlice> The record with the given ID won't re-set the owner if one has already been set. (The result, returning nModified, lets you know if the assignment worked, or there was a conflict.)
[21:41:26] <GothAlice> Indeed, that's how I'd code up the initial implementation. Start simple. :)
[21:42:07] <keeger> how does that work for document atomicity
[21:42:15] <GothAlice> A $set is atomic.
[21:42:22] <keeger> if i go, person A, location [1] update
[21:42:29] <keeger> and location[1] is a reference, is that atomic?
[21:43:29] <GothAlice> Updates to a single record will always be resolved in a linear fashion. So two updates, U1 and U2, will never really "conflict" (other than trampling each-other's data), and the "find" part of that update handles the trampling by requiring the field to be empty (null) or the $set won't be applied.
[21:44:25] <GothAlice> If your application layer issues the updates in the same microsecond it may seem semi-random which one will win. But with an update like the above, only one will win. The other will update nothing and be informed of this fact.
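Concretely (locId, userA and userB are placeholders), the two racing claims from GothAlice's example come back like this in the 2.6+ shell:

    db.locations.update({ _id: locId, owner: null }, { $set: { owner: userA } });
    // WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })   <- winner
    db.locations.update({ _id: locId, owner: null }, { $set: { owner: userB } });
    // WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })   <- loser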
[21:44:25] <dscastro> hey guys
[21:44:39] <dscastro> what does that mean: failed with error 57: "The dotted field 'haproxy-1.4/web_proxy' in 'pending_op_groups.0.pending_ops..sub_pub_info.haproxy-1.4/web_proxy' is not valid for storage."
[21:45:00] <GothAlice> dscastro: You aren't allowed to have extra symbols in field names.
[21:45:11] <GothAlice> For example, not much about that field name is valid.
[21:46:06] <dscastro> GothAlice: did it start in 2.6.x?
[21:46:18] <GothAlice> http://docs.mongodb.org/manual/reference/limits/#Restrictions-on-Field-Names
[21:46:36] <dscastro> i just upgraded my database from 2.4 to 2.6
[21:46:38] <GothAlice> For sanity, I limit field names to the regex: [a-z_]+
[21:47:36] <GothAlice> dscastro: No, likely what you were doing wasn't expressly checked or forbidden before, but was still documented as being a no-no. (I.e. it let you get away with it, but you're still a bad person for trying to get away with it. ;)
[21:48:07] <dscastro> humm.. got it
[21:48:36] <GothAlice> dscastro: Someone using MongoDB should follow their language's variable naming convention. For most languages, this means the regex: [a-zA-Z_][a-zA-Z0-9_]*
[21:49:08] <GothAlice> With your field, you can't access it using attribute notation. (foo.haproxy-1.4/web_proxy would be interpreted as foo.haproxy minus 1.4 divided by the contents of the web_proxy variable.)
[21:51:44] <dscastro> GothAlice: its a moped object
[21:52:27] <dscastro> GothAlice: this is a rails model
[21:52:37] <GothAlice> Oh, ruby. That goes some distance in explaining the lack of convention. ;)
[21:53:17] <GothAlice> So, uh, the layer designed to abstract things and make your life easier is, in this case, making your life harder.
[21:53:59] <dscastro> GothAlice: yep
[21:54:18] <mikeputnam> hey! i too hate ruby and prefer python. :) small world
[21:55:13] <GothAlice> Little-appreciated fact: MongoDB, being schemaless, must store the _names_ of every field in each document that uses that field. Your key adds 20 bytes (beyond a single-letter key) per document that uses it. Something to consider. ;) (I use one- or two-character field names and let my ODM abstraction layer do its thing.)
[21:55:58] <GothAlice> dscastro: Moving forward, fixing your situation will require $rename'ing the fields, then trying to upgrade again.
[21:56:27] <GothAlice> dscastro: Don't forget to recreate any indexes as appropriate, too.
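A sketch of that $rename step for a simple top-level field (collection and field names are invented); fields nested inside arrays or containing literal dots can't be addressed with $rename and generally need a small script that rewrites each document instead:

    db.apps.update(
        { "web/proxy-info": { $exists: true } },
        { $rename: { "web/proxy-info": "web_proxy_info" } },
        { multi: true }
    );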
[21:56:34] <dscastro> GothAlice: i'm just trying to figure out why it was working before upgrade
[21:56:43] <dscastro> should i reindex it?
[21:56:46] <GothAlice> dscastro: MongoDB was missing an assert() call.
[21:57:24] <GothAlice> dscastro: Are any of the freakishly-named fields indexed? If so, you're going to _have_ to reindex after re-naming in order to upgrade. Bogus field names would remain bogus, and current versions aren't missing the assert().
[21:57:40] <dscastro> got it
[22:17:17] <dscastro> GothAlice: what about this: /usr/bin/mongod: symbol lookup error: /usr/bin/mongod: undefined symbol: _ZN7pcrecpp2RE4InitEPKcPKNS_10RE_OptionsE
[22:18:26] <GothAlice> dscastro: Hmm, the symbol name is gibberish to me, but that typically means you have MongoDB compiled for a library not present on your system. I.e. you installed the Ubuntu version on Debian, etc. I compile MongoDB directly for my cluster, so that's rarely an issue for me.
[22:20:33] <GothAlice> (Also because I want SSL support, and am too cheap to buy MongoDB Enterprise. ;)
[22:21:13] <mocx> "MongoDB supports no more than 100 levels of nesting for BSON documents."
[22:21:58] <GothAlice> mocx: Indeed. As with the 16MB document size limit, if you're hitting it, you're probably doing something wrong in the design of your schema.
[22:22:30] <mocx> 100 levels as in field: { level2: { level3: { name: "Steve" } } }
[22:22:37] <GothAlice> mocx: Correct. See also: http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html
[22:22:47] <mocx> how many documents with in array?
[22:22:57] <GothAlice> An array is considered one level.
[22:23:02] <mocx> so infinite?
[22:24:01] <GothAlice> Well, limited by the 16MB document size limit.
[22:24:07] <mocx> :)
[22:24:08] <mocx> thanks!
[22:24:14] <GothAlice> To ballpark BSON storage sizes, reference: http://bsonspec.org/
[22:31:12] <bendyeraus> hello, I’m optimising my queries, and was wondering to achieve a covered query why do I need to suppress _id? surely the _id is in the index? how else is the index linked to the documents in the heap?
[22:33:09] <GothAlice> bendyeraus: Specifically, you need all of the returned fields to be stored in the single index that gets used for the query. Unless you're including _id in your "covered" index, you'll need to exclude the field during projection.
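A quick shell sketch of what that means in practice (collection and field names invented): with a single index on x, returning only x and suppressing _id lets the query be answered from the index alone.

    db.things.createIndex({ x: 1 });
    db.things.find({ x: 123 }, { x: 1, _id: 0 });
    // .explain("executionStats") should report totalDocsExamined: 0 when covered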
[22:34:16] <ParkerJohnston> i have a question about mongo cursor
[22:34:20] <ParkerJohnston> new to mongo
[22:34:34] <bendyeraus> Hi GothAlice, thanks, my question was more to do with if _id is not automatically in the index, how is the index linked to the documents in the heap?
[22:34:38] <GothAlice> ParkerJohnston: Ask, don't ask to ask. :)
[22:35:09] <ParkerJohnston> public static MongoCursor<Document> getAllActiveHistoryByCaseNumber(String caseNumber) {
[22:35:10] <ParkerJohnston> return caseInformation.find(and(eq("casenumber", caseNumber), eq("history.active", true))).iterator();
[22:35:10] <ParkerJohnston> }
[22:35:30] <GothAlice> bendyeraus: Behind-the-scenes is a complex BTree bucketing process. Yes, there is some form of association between an individual index value and the record it came from, but it might not be the ObjectId. (It might be the stripe ID and index into the on-disk stripe of the record… might not be.)
[22:35:31] <ParkerJohnston> this is returning all embedded docs in a single item, not in a hasNext format, so i cant iterate over them
[22:35:34] <cheeser> is that 3.0 code?
[22:35:39] <ParkerJohnston> yeah
[22:35:42] <cheeser> w00t!
[22:35:44] <GothAlice> bendyeraus: Thus unless you explicitly include _id in the list of fields, don't expect it to be there.
[22:35:52] <cheeser> playing with the betas/RCs?
[22:35:56] <ParkerJohnston> yeah
[22:36:03] <cheeser> awesome
[22:36:07] <ParkerJohnston> figured if i am going to start i might as well go all in
[22:36:25] <cheeser> indeed. especially since it's into the RC phase
[22:37:03] <ParkerJohnston> exactly
[22:37:05] <bendyeraus> GothAlice: thanks.
[22:37:19] <ParkerJohnston> just no idea why it is grouping the embedded doc
[22:37:33] <ParkerJohnston> it shows there are two documents but only counts it as one
[22:38:53] <cheeser> "embedded docs" are largely a conceptual thing for humans.
[22:39:24] <cheeser> a collection has documents and when you fetch that document you get everything inside it regardless of whether it could be considered an embedded document or not.
[22:39:26] <ParkerJohnston> but they are stored as an array why cant i get them back out the same way
[22:40:04] <cheeser> can you pastebin what you're getting back and how it doesn't match expectations?
[22:43:55] <bendyeraus> While I’m here, one other question I was thinking about: for optimal count queries, should I be doing a find query, projecting the fields I know will cause a covered query and then calling count, i.e. find({x:123}, {x:1,_id:0}).count(), or is count smart enough to only access the index, i.e. is count({x:123}) enough?
[22:44:03] <ParkerJohnston> http://pastebin.com/Wm0xE0j6
[22:48:33] <cheeser> ParkerJohnston: and what's wrong with that document?
[22:48:45] <cheeser> history is an array of documents...
[22:49:13] <ParkerJohnston> what should it be?
[22:49:32] <cheeser> i dunno. why shouldn't it be an array?
[22:49:49] <ParkerJohnston> oh misread your statement
[22:50:40] <ParkerJohnston> the issue is that when i get the results and try to use history.next() to print out the results it does not print out two lines, it is one large cluster
[22:51:40] <cheeser> what two lines? you have one document there.
[22:51:50] <cheeser> are you trying to iterate the history?
[22:52:42] <ParkerJohnston> correct
[22:53:05] <cheeser> you're getting the document from the collection. the history array would be nested inside that document.
[22:53:06] <ParkerJohnston> to loop over it and display it in a table
[22:53:34] <ParkerJohnston> correct..i guess a question comes how do i get into that array then?
[22:54:05] <cheeser> document.get("history") would return a List<Document>
[22:54:20] <cheeser> Document is just a Map, basically
[22:55:46] <ParkerJohnston> does not make sense
[22:55:58] <ParkerJohnston> right now it is returning a mongocursor
[22:57:05] <cheeser> what is? history.next()?
[22:57:28] <cheeser> that history there has no relation to the "history" field in your document
[22:57:37] <cheeser> it's just the java variable name you picked.
[22:57:37] <ParkerJohnston> MongoCursor<Document> history = History.getAllActiveHistoryByCaseNumber("2015-REP-01-0001");
[22:57:44] <ParkerJohnston> correct
[22:58:08] <cheeser> so history.next().get("history") gets you that array
[22:58:34] <ParkerJohnston> yes, but i cant set that to a List<Document>
[22:59:20] <ParkerJohnston> nevermind..had another error
[22:59:24] <ParkerJohnston> yes that makes sense now
[23:00:17] <ParkerJohnston> thanks!
[23:00:19] <ParkerJohnston> great help
[23:00:56] <cheeser> all set?
[23:01:38] <ParkerJohnston> for the next few hours ;)
[23:01:46] <ParkerJohnston> thanks!
[23:02:00] <ParkerJohnston> trying to figure out the amazingness that is mongo
[23:02:23] <cheeser> cool.
[23:04:04] <GothAlice> Much amaze, all the data, super wow.