PMXBOT Log file Viewer


#mongodb logs for Monday the 6th of June, 2016

[08:41:16] <ShekharReddy> hey guys how can i start the mongodb on ubuntu 14.04
[08:43:39] <ShekharReddy> hey guys how can i start the mongodb on ubuntu 14.04, i have started the service using sudo service mongod start
[08:43:59] <ShekharReddy> but i am unable to get the console to use the db inside it
[08:44:09] <ShekharReddy> i am a little new here using mongo
[09:09:36] <chris|> the documentation about backing up a secondary using a filesystem snapshot says I "may skip" calling fsyncLock on it if journaling is enabled. Should I read that as "you may skip, but it's recommended not to" or is it safe to do so? Also, if I do not lock the secondary, should I freeze it?
[09:21:42] <Derick> chris|: it's recommended to do fsynclock still
[09:22:15] <Derick> it'd be even better to do it from a hidden node...
[09:25:56] <chris|> yeah, but that's not an option atm
[09:26:17] <chris|> quick followup, why is this important: "When calling db.fsyncLock(), ensure that the connection is kept open to allow a subsequent call."
[09:28:16] <Derick> because the lock is released when you close the connection I think
[09:28:26] <Derick> (to prevent random locks having been set on the DB)
[09:30:21] <chris|> that would be strange, as the docs also say: Closing the connection may make it difficult to release the lock. ;)
[09:30:59] <Derick> oh
[09:31:00] <Derick> hah
[09:31:03] <Derick> I don't know that ;-)
[09:43:02] <chris|> well, it seems to work just fine
[09:43:57] <Derick> have a link to these docs? I'll ping the tech team for the reason
[09:45:53] <chris|> sure: https://docs.mongodb.com/manual/reference/command/fsync/#impact-on-read-operations
[09:50:32] <Derick> ah, it can block reads too
[10:19:32] <chris|> true, but when does that become an issue with node locks? authentication?
[10:43:38] <joannac> chris|: yes, authentication
[10:44:07] <joannac> after fsyncLock(), writes are queued, and reads (i.e. auth checks) will queue behind the queued writes
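The lock/unlock sequence being discussed can be sketched in pymongo (a sketch, not a recipe: it assumes pymongo is installed, a server recent enough to support the fsyncUnlock command, and `client` being an already-constructed MongoClient; the snapshot callback is hypothetical):

```python
# The two commands involved; per the docs quoted above, the unlock must
# go over the SAME open connection (the same client) that took the lock.
LOCK_CMD = {"fsync": 1, "lock": True}
UNLOCK_CMD = {"fsyncUnlock": 1}

def snapshot_under_lock(client, take_snapshot):
    """Flush pending writes, hold the global lock while take_snapshot()
    runs, then release the lock over the same client."""
    client.admin.command(LOCK_CMD)    # e.g. client = pymongo.MongoClient()
    try:
        take_snapshot()               # take the filesystem snapshot here
    finally:
        client.admin.command(UNLOCK_CMD)
```

The try/finally matters: if the snapshot step fails, the lock is still released rather than left blocking writes (and, as discussed above, eventually reads).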
[12:32:44] <miracee> Hello, we organise a database dev room on FrOSCon (Germany) and I am looking for speakers ... we want to provide multiple systems.
[12:41:33] <zzookk> hello guys. tell me pls: are "find().sort({"thisfieldname":-1}).limit(1)" and find.max the same construction if i want to get the highest value?
[12:42:51] <zzookk> never used mongo before :(
[13:37:39] <StephenLynx> wat
[13:37:50] <StephenLynx> the first query seems to be right for that.
[13:38:27] <StephenLynx> but I never saw anything like a max command for a cursor before.
[13:38:38] <StephenLynx> kek he left
[13:38:39] <StephenLynx> v:
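For the record, the first construction is indeed the standard one. A pymongo sketch (the field name comes from the question; there is no `max()` helper for this on pymongo cursors — the shell's `cursor.max()` sets index bounds, it does not return a maximum):

```python
# Sort descending on the field and take one document: the idiomatic way
# to fetch the document holding the highest value for a field.
HIGHEST_SORT = [("thisfieldname", -1)]  # list of (key, direction) pairs

def find_highest(collection):
    # collection would be e.g. MongoClient().mydb.mycoll
    return collection.find().sort(HIGHEST_SORT).limit(1)
```

With an index on the field, this touches only one document instead of scanning the collection.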
[14:06:17] <KostyaSha> can mongo use multiple cores during index build?
[14:56:14] <cppking> hello guys, I got a question: if a replSet's master (A) goes down, then one secondary (B) will become master. If node A has a priority larger than any other node, what happens when A starts after 10 hours and rejoins the replSet? Will the 10 hours of data be lost??
[15:06:30] <integer`> cppking, single node can't become primary
[15:06:37] <integer`> cppking, check votes parameter
[15:07:18] <KostyaSha> cppking, i may be mistaken, but first it does an election based on votes, and only then does priority have influence
[15:38:00] <cppking> KostyaSha: Do you understand? My replSet has 3 nodes (A B C), C is an arbiter, and A has a priority larger than B. If master A goes down, B will become master. If I start A 10 hours later, will the data produced in those 10 hours be lost??
[15:40:08] <cppking> because when A starts, A will become master; then the data is inconsistent between A and B, so data will be lost
[16:11:02] <cppking> integer`: any help??
[16:11:20] <cppking> anybody here? help me
[16:25:16] <cheeser> Morphia 1.2.0 is out! https://github.com/mongodb/morphia/releases/tag/r1.2.0
[16:45:41] <troll_king> hi
[16:46:00] <troll_king> i need advice regarding mongo and nodejs
[16:47:00] <troll_king> my code is mongojs('irclogs').collection('logs') but when i console.log it, it shows only 1 function
[16:47:05] <troll_king> which is _get
[16:47:34] <troll_king> so it is not connected?
[16:48:12] <troll_king> i'm using mongojs
[16:49:40] <troll_king> when i try to save data in it, it shows mongodb.DB.connect is not a function
[16:50:09] <troll_king> console.log shows [function] when i test the save function
[16:50:28] <troll_king> any advice or suggestion is appreciated
[16:55:58] <StephenLynx> id recommend you use the official mongodb driver
[17:20:45] <renlo> This might be a stupid question: at my work we have a Python web application. This web application fetches data from Mongo, then serializes the data into JSON. Thing is, the number of documents that are serialized can be ~20k. When Python serializes the data, the server goes to 100% CPU usage, and it takes a number of seconds (5-10). Is there a way to get Mongo's output as JSON directly?
[17:21:21] <renlo> or is that just from using a Python client which is automagically turning JSON documents into some Python dict, and then its getting turned into JSON documents again?
[17:27:07] <renlo> is there a place which is actually active for mongo related questions?
[17:27:15] <renlo> every time I'm in here its a ghost town
[17:28:01] <StephenLynx> kek
[17:28:12] <cheeser> it's quite active actually
[17:28:13] <StephenLynx> renlo, mongo outputs BSON
[17:28:23] <StephenLynx> and the driver does w/e it can with the output.
[17:28:37] <renlo> but its probably converting it into a dict, right?
[17:28:45] <StephenLynx> in your case, it has to convert this BSON into python objects.
[17:28:56] <StephenLynx> I know dicks about python so I don't know any details of that.
[17:29:07] <StephenLynx> and that has some overhead.
[17:29:07] <renlo> ah, dict ~ a JS object
[17:29:14] <StephenLynx> are you using python 3?
[17:29:18] <renlo> python 2.8
[17:29:21] <renlo> * 2.7
[17:29:25] <StephenLynx> so it shouldn't be TOO slow.
[17:29:45] <StephenLynx> if you want, you could try with a more CPU-efficient runtime environment, like node.
[17:29:59] <renlo> yeah I might try that
[17:30:04] <StephenLynx> or just try with the terminal command.
[17:30:07] <cheeser> by the time you see the structs in python, they're already python dicts and not bson/json
[17:30:22] <cheeser> so if you're converting those dicts to something and *then* to a json string, you're doing too much.
[17:30:36] <StephenLynx> and see if there's any way for you to make it output json directly
[17:30:39] <cheeser> jesse says he can get ~50k docs/sec on his laptop
[17:30:44] <renlo> wow nice
[17:31:32] <renlo> cheeser: so its safe to say I am probably first converting BSON to a Python object / dict, then converting the Python object / dict to JSON?
[17:31:38] <StephenLynx> yes.
[17:31:39] <cheeser> now, those are probably not *large* docs but large and small are fairly subjective
[17:32:00] <StephenLynx> your driver is doing the first part and your application the second one.
[17:32:04] <cheeser> renlo: yes. the driver is already doing the BSON document -> python dict for you
[17:32:24] <renlo> okay thanks guys, I'll see if I can a) convert the BSON straight to JSON with the Python driver, and if that doesnt work b) look into using node
[17:32:45] <StephenLynx> c) do it using any tools mongo offers, like mongo export
[17:33:18] <StephenLynx> I think that would be the fastest way to obtain a json string from the data.
[17:33:26] <renlo> it's a Django web app, how would you do that? Spawn a subprocess which runs the mongoexport command?
[17:33:34] <StephenLynx> yes.
[17:33:40] <StephenLynx> you could get the string from stdout
[17:33:41] <cheeser> yeesh
[17:33:47] <StephenLynx> I do that a lot on node
[17:33:58] <StephenLynx> to run system tools, like tesseract or imagemagick
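The subprocess approach being described can be sketched like this in Python (assumptions: the `mongoexport` binary is on PATH; the `irclogs`/`logs` names are just the ones mentioned earlier in the channel):

```python
import subprocess

def build_mongoexport_cmd(db, collection):
    # mongoexport writes JSON, one document per line, to stdout by default.
    return ["mongoexport", "--db", db, "--collection", collection]

def export_json(db, collection):
    # Capture stdout as the exported bytes; raises CalledProcessError
    # if mongoexport exits non-zero.
    return subprocess.check_output(build_mongoexport_cmd(db, collection))

# e.g. data = export_json("irclogs", "logs")
```

As is pointed out a few lines down, spawning an external process has its own overhead, so this is worth profiling rather than assuming it wins.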
[17:33:59] <renlo> cheeser: is that a 'yeesh' as in 'my god dont do that' or a 'yeesh' as in 'yes' lol
[17:34:04] <StephenLynx> kek
[17:34:24] <cheeser> it's a "that should be a last resort"
[17:34:29] <renlo> k
[17:34:41] <StephenLynx> I don't think its too bad, IMO.
[17:34:45] <cheeser> exec()ing out like that could potentially eat up any speed gains of using mongoexport
[17:35:00] <StephenLynx> hm, that I didn't know.
[17:35:02] <cheeser> but as with all perf questions, profile. don't guess.
[17:35:23] <StephenLynx> what is it about exec and the like that eats up performance?
[17:35:42] <cheeser> fwiw, the java driver is much faster than mongodump so it's not necessarily safe to assume the tools are faster than the drivers.
[17:35:49] <StephenLynx> hm
[17:36:01] <StephenLynx> what about mongoexport?
[17:36:04] <cheeser> there's overhead in spawning external processes and reading/writing data
[17:36:09] <StephenLynx> because mongodump does a lot of tasks.
[17:36:16] <StephenLynx> like index stuff
[17:36:21] <cheeser> this particular issue was with mongodump specifically. we didn't profile mongoexport
[17:38:33] <cheeser> doesn't look like pymongo allows access to the BsonDocument...
[17:40:56] <renlo> really?
[17:41:50] <cheeser> i'm not super familiar with the docs but I'm not seeing anything.
[17:42:20] <cheeser> e.g., the java driver exposes a type called BsonDocument which still has the documents in their BSON form and not, say, a Map
[17:42:51] <cheeser> so there'd be no conversion of types from their BSON types to native Java types. Given this, you can encode them to JSON without double decoding.
[17:43:01] <cheeser> in the pymongo docs, i'm not seeing such a type.
[17:43:05] <cheeser> jesse's back. i'll ask.
[17:43:54] <renlo> I found this: https://api.mongodb.com/python/current/api/bson/json_util.html , I think if I pass it the pymongo query (cursor?) then it will do it without converting it to a dict first, though I am not 100% sure about that
[17:53:42] <cheeser> talking to jesse, it sounds like that probably does the python dict step, too.
[17:54:09] <cheeser> he said they're considering a "BsonJsonCodec" in the next quarter or two but they're still in the discussion phase.
[17:54:44] <cheeser> he said to email him at jesse@mongodb.com and introduce yourself. he said they're definitely eyeballing this use case and this could help them move forward.
[17:55:31] <renlo> for sure, thank you
[17:55:48] <cheeser> np
[17:56:05] <cheeser> also, run this in your code: pymongo.has_c()
[17:56:31] <cheeser> if that returns false, the C extensions aren't being used and you're running the pure python bson code which is much slower than the C libs
[17:58:05] <renlo> ah, that returned True, thanks again
[17:58:53] <cheeser> oh, that's kind of a disappointment because you could've had a huge win by recompiling with the C extensions. :)
[17:59:15] <renlo> haha yeah
[17:59:29] <renlo> I think I'm going to end up using some other driver...is the Go driver any good?
[17:59:53] <renlo> I guess its not official, I'm referring to https://labix.org/mgo
[18:00:02] <cheeser> it's official, yes.
[18:00:12] <cheeser> we use that in all our command line tools, e.g.
[18:00:17] <renlo> oh nevermind then
[18:00:27] <renlo> its good though?
[18:00:41] <cheeser> it is. if you like Go, at least.
[18:00:53] <cheeser> it's pretty much your only option in Go
[18:02:33] <Forbidd3n> Hey everyone. Quick question. Is it best to store all elements in one data set or have multiple datasets linked by referential ID?
[18:03:11] <Forbidd3n> By data set I mean collection
[18:03:46] <cheeser> you should avoid using references as the whole point of mongodb is to store self-contained documents.
[18:04:06] <cheeser> each external reference is another round trip to the database to fetch that document.
[18:04:11] <Forbidd3n> ok, so it is best to have sub/sub/sub objects
[18:10:04] <Forbidd3n> cheeser: for example- {"schools":{"Bears":{"classes":{"Math":{"days":["M","T","W"]},"Science":{"days":["M","W"]},"English":{"days":["T","W"]}}
[18:10:41] <Forbidd3n> I know that isn't correct format, but my question is really the depth of a collection, is it relevant?
[18:12:13] <cheeser> exactly how to model your documents is hard outside the context what's in them and how they're going to be used and queried.
[18:12:39] <cheeser> i can point you to docs but other than that, it's difficult to get too exact
[18:13:49] <Forbidd3n> cheeser: I understand. I will be querying different levels using the Slim framework API and updating multiple levels
[18:18:32] <Forbidd3n> cheeser: http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1 : here it says if you are going to have One-to-Squillions then you should reference by ID from another collection
[18:19:36] <Forbidd3n> so static data I think I can put in one collection, and constantly changing data in its own collection so it can be updated by ID, removed, or have new items added more easily than filtering through other data.
[18:20:18] <StephenLynx> Forbidd3n, in that example I wouldn't nest it so deep.
[18:20:28] <StephenLynx> i'd make a separate collection before that.
[18:21:00] <Forbidd3n> StephenLynx: that is what I thought as well.
[18:39:15] <Forbidd3n> StephenLynx: if a sub item in the collection needs to be the reference to the other collection, then I would need the sub item to be its own collection as well, correct?
[18:40:05] <StephenLynx> not necessarily, but that would be the most practical way to do it.
[18:40:26] <StephenLynx> keep in mind mongodb doesn't implement references.
[18:40:33] <StephenLynx> so either way you are implementing this reference on application code.
[18:40:44] <Forbidd3n> StephenLynx: correct. I would have to store the id of the referenced collections
[18:40:48] <StephenLynx> be it a document on a collection or a sub-document on another.
[18:40:51] <StephenLynx> not necessarily.
[18:41:03] <StephenLynx> you could reference it using any field of any type you wish.
[18:41:18] <StephenLynx> _id is just a field that is there by default and is unique.
[18:41:38] <Forbidd3n> StephenLynx: the project I am working on is Cruise Lines, Cruise Ships, Itineraries.
[18:41:54] <Forbidd3n> the itineraries are linked to the ships, not the lines
[18:41:58] <StephenLynx> sometimes you don't want to use _id because you want something readable for example.
[18:43:49] <Forbidd3n> StephenLynx: so I wouldn't store the object id on the related collection?
[18:44:04] <StephenLynx> if you wanted not to, you wouldn't.
[18:44:43] <Forbidd3n> ?
[18:44:43] <StephenLynx> because mongo doesn't implement relations, you can do anything you wish for that.
[18:45:01] <StephenLynx> for example, you have users and their login is unique
[18:45:14] <StephenLynx> for a given task you want to display the related user's login
[18:45:31] <StephenLynx> if you reference it by the _id you will have to get the user's login from that id to do so
[18:45:40] <StephenLynx> if you reference by the login directly, you don't have to.
[18:45:50] <StephenLynx> you just display the referenced login.
[18:46:14] <StephenLynx> since the login is unique, its as good as _id or any other unique index when referencing users.
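A tiny sketch of that pattern (all names here — users, tasks, "alice" — are hypothetical; the one requirement is that login has a unique index):

```python
# With users unique on "login", a task can carry the login directly and
# display it with no second lookup; resolving the full user document is
# only needed when more than the login is required.
user = {"login": "alice", "name": "Alice"}
task = {"title": "deploy", "owner_login": "alice"}

def owner_filter(task):
    # Filter for fetching the full user document when needed,
    # e.g. users.find_one(owner_filter(task))
    return {"login": task["owner_login"]}

# Enforce uniqueness once, e.g.: users.create_index("login", unique=True)
```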
[18:46:38] <Forbidd3n> gotcha, so in this case you can request a user by 'loginname' and then you have the _id to get related records if needed
[18:47:03] <StephenLynx> or not even that, you can relate the user everywhere by its login.
[18:47:10] <Forbidd3n> gotcha
[18:47:21] <Forbidd3n> in my case I would be using the _id
[18:47:46] <Forbidd3n> I think it would be best since my record will be a title string
[18:47:47] <StephenLynx> make sure to store an ObjectId for the relation then
[18:48:01] <StephenLynx> instead of a string representing the ObjectId
[18:48:18] <StephenLynx> that way you don't have to instance and ObjectId from the string when querying
[18:48:28] <StephenLynx> instance an*
[18:49:28] <Forbidd3n> so in the CruiseLine collection I would store Ships which would be 'ID' from the ship collection and in the ship collection I would store Schedule which would be each schedule for the ship - am I on the right track here?
[18:50:20] <StephenLynx> dunno, i'd need a sketch of sorts for the model
[18:50:36] <Forbidd3n> StephenLynx: ok will create a simple schema to show
[18:59:02] <Forbidd3n> StephenLynx: something like this - http://pastebin.com/sZSCkjRP
[18:59:29] <Forbidd3n> the first set would be the line collection and the second the ship collection
[18:59:44] <Forbidd3n> there will be a third for schedules
[19:01:05] <StephenLynx> eh
[19:01:10] <StephenLynx> I think you messed up there
[19:01:23] <Forbidd3n> they are two different objects
[19:01:24] <StephenLynx> it seems you have just a big document with several keys
[19:01:38] <Forbidd3n> it is pseudo concept
[19:01:49] <StephenLynx> :v
[19:01:59] <StephenLynx> why a pseudo concept instead of just the concept?
[19:02:14] <Forbidd3n> StephenLynx: because I didn't want to lay it all out if the concept was incorrect
[19:02:58] <StephenLynx> I still can't be sure if you don't write what you actually mean.
[19:03:11] <StephenLynx> https://gitgud.io/LynxChan/LynxChan/blob/master/doc/Model.txt
[19:03:13] <Forbidd3n> basically I would be storing the lines in a collection and referencing the ships from it's own collection
[19:03:14] <StephenLynx> for example
[19:04:27] <renlo> what are the fastest Mongo drivers? I think cheeser mentioned that the Java driver was faster than mongoexport or something? Anyone have any experience with the different drivers?
[19:04:31] <Forbidd3n> StephenLynx: in your example where are you linking the threads to the board?
[19:05:34] <StephenLynx> see the very bottom
[19:05:39] <cheeser> java, c#, go, i believe are the top 3, renlo
[19:06:25] <Forbidd3n> StephenLynx: bottom of the thread or bottom of the page?
[19:07:06] <cheeser> there's one benchmark that showed php outperforming all others but given php's ... nature I don't think anyone besides Derick and Jeremy trust that one
[19:07:13] <renlo> lol
[19:07:17] <StephenLynx> of the document
[19:08:11] <Forbidd3n> StephenLynx: what section? I understand what you mean by laying it out, which I can do, but trying to figure out how to link the tables and wanted to see how you are doing it
[19:08:37] <StephenLynx> the whole document
[19:08:53] <StephenLynx> line 320
[19:09:45] <Forbidd3n> StephenLynx: so you link thread to board by boardUri?
[19:09:57] <StephenLynx> yes
[19:15:58] <Forbidd3n> StephenLynx: how does this look - http://pastebin.com/VUZ50UEK
[19:16:34] <Forbidd3n> there will be more elements for each collection, but just put that for starters
[19:17:37] <StephenLynx> will line have more stuff to it?
[19:18:00] <Forbidd3n> yeah, like weight, max occupancy and so on
[19:18:03] <StephenLynx> ok
[19:18:16] <Forbidd3n> well wait that will be for ships sorry
[19:18:25] <Forbidd3n> line will have contact info mostly
[19:18:30] <StephenLynx> ok
[19:18:38] <StephenLynx> each line can have multiple ships on it
[19:18:39] <StephenLynx> ?
[19:18:43] <Forbidd3n> correct
[19:18:50] <Forbidd3n> and each ship has multiple schedules
[19:18:53] <StephenLynx> ok
[19:19:01] <StephenLynx> yeah, its looking alright.
[19:19:04] <Forbidd3n> but the schedule is linked to the ship not the line
[19:19:23] <StephenLynx> btw, its _id
[19:19:24] <Forbidd3n> this is why I think I need more than one collection
[19:19:32] <Forbidd3n> yeah, I noticed that, thanks
[19:20:02] <Forbidd3n> so this route is good?
[19:20:06] <StephenLynx> seems so.
[19:20:13] <Forbidd3n> ok, thanks
[19:20:17] <StephenLynx> if your future requirements are what you told me.
[19:20:27] <Forbidd3n> they are
[19:20:51] <StephenLynx> also
[19:20:57] <Forbidd3n> the api will allow the user to search a schedule by port and get all the ships that go there
[19:21:02] <StephenLynx> I would specify the type of any field that isn't a string or _id
[19:21:10] <Forbidd3n> gotcha
[19:21:15] <StephenLynx> arrives and departs
[19:21:27] <Forbidd3n> I like how this is laid out, thanks for the concept layout
[19:21:30] <StephenLynx> np
[19:21:40] <StephenLynx> and remember to always use actual date objects for dates.
[19:21:55] <StephenLynx> I started out using strings and that didn't pan out so well :v
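The pitfall can be shown without a database at all (a Python sketch; drivers map native date objects to the BSON date type, while strings stay strings):

```python
from datetime import datetime

# Native date objects compare, sort, and range-query chronologically.
# Strings compare lexicographically, which silently breaks for many
# date formats.
departs = datetime(2016, 6, 6, 14, 30)
arrives = datetime(2016, 6, 10, 8, 0)

assert departs < arrives            # chronological: correct
assert "10/6/2016" < "6/6/2016"     # lexicographic: later date "sorts" first
```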
[19:23:40] <Forbidd3n> you are referring to DateTime objects in PHP
[19:23:48] <StephenLynx> not really.
[19:23:49] <Forbidd3n> something like that
[19:23:54] <Forbidd3n> hmm
[19:24:00] <StephenLynx> I'm referring to the types that are used when storing on mongo.
[19:24:00] <Forbidd3n> ok, how do you store yours?
[19:24:16] <Forbidd3n> ahh, I have to look that up
[19:24:20] <StephenLynx> how the types of your code are converted is up to the driver.
[19:24:29] <StephenLynx> in js, for example
[19:24:34] <StephenLynx> I use Date objects.
[19:27:28] <Forbidd3n> I will be creating my json code from php arrays so I have to figure out the format for that to store in mongo
[19:27:57] <StephenLynx> yeah, consult the driver and then check on the terminal client for the types being used.
[19:28:12] <Forbidd3n> will do thanks for all the help StephenLynx!
[19:28:15] <StephenLynx> if I am not mistaken, PHP driver has its own types for some stuff, like dates.
[19:30:51] <jokke> hello
[19:32:10] <jokke> i'm trying to collect some metrics from my mongodb instance. For this i run collStats on every collection in every database
[19:32:14] <jokke> it's quite a lot
[19:32:28] <jokke> and at some point i get an error: too many files open
[19:33:41] <jokke> is there anything i can do (besides raising the limit)?
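For reference, the loop being described looks roughly like this in Python (a sketch assuming pymongo of that era; note the single shared client — a client per database or collection would leak connections and file handles client-side, and the "too many open files" error may also reflect the file-descriptor ulimit on the mongod side, since running collStats on every collection can force many data files open):

```python
# One client reused for every database and collection. Newer pymongo
# renames these helpers list_database_names()/list_collection_names().
def all_coll_stats(client):
    stats = {}
    for db_name in client.database_names():
        db = client[db_name]
        for coll_name in db.collection_names():
            stats[(db_name, coll_name)] = db.command("collStats", coll_name)
    return stats
```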
[19:35:44] <Forbidd3n> StephenLynx: yeah, there is a MongoDate class
[19:40:13] <StephenLynx> pay attention to those, so the type used on the db is the type you set out to use on the model.
[20:05:32] <zylo4747> is there any way to create a free tier cloud manager deployment or is that no longer possible?
[20:11:11] <cheeser> i think there's a trial period.
[20:11:17] <cheeser> 14d or some such
[20:19:51] <zylo4747> ok so after that it starts charging you
[20:20:07] <cheeser> yes
[20:21:22] <zylo4747> thanks
[20:22:36] <kurushiyama> zylo4747: I strongly advise against setting up your cluster with CM if you do not plan to continue to use it.
[20:28:40] <bros> MongoError: server local_network_ip:27017 timed out
[20:36:47] <bros> How do I handle this/fix this?
[20:40:48] <StephenLynx> fix the connectivity issue
[20:41:33] <kurushiyama> StephenLynx: Huh?
[20:41:46] <StephenLynx> wot
[20:49:19] <jokke> or do you suspect it's a driver issue?
[20:52:09] <kurushiyama> StephenLynx: Sorry, "fix the connectivity issue" was context-free for me and did not get it, hence I asked.
[20:54:04] <StephenLynx> it was for bros
[20:54:09] <StephenLynx> who had a disconnect from his db
[21:03:10] <bros> StephenLynx: is that what it is?
[21:03:14] <bros> it couldn't be anything else, right?
[21:09:29] <ebarault> hello
[21:14:41] <ebarault> can this channel be used to ask for support?
[21:15:12] <gaboesquivel> Hi There
[21:15:59] <Derick> ebarault: just ask - if somebody knows, they'll answer
[21:17:51] <gaboesquivel> is there any way I can $group and $push a nested array ?
[21:17:54] <gaboesquivel> https://gist.github.com/gaboesquivel/06466dfb84885f5b59b693ed71096d0a
[21:18:14] <ebarault> well, i'm having random "MongoError: not authorized on XXX to execute command {YYY}" with mongo 3.2, since very recently
[21:18:19] <gaboesquivel> see what I'm trying to do with 'cell_options.environment' there
[21:18:41] <ebarault> does it ring a bell to anyone?
[21:19:51] <gaboesquivel> I only need to filter the result set based on the values contained in that field.
[21:20:53] <gaboesquivel> for instance all documents where cell_options.environment contains either {value: 'Water' } or {value: 'Gym'}
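The filtering part of that, at least — documents whose cell_options.environment array contains an object with value "Water" or "Gym" — fits in a single query filter. A sketch (field names taken from the gist):

```python
# Usable directly in find(), or as a {"$match": FILTER} stage placed in
# an aggregation pipeline before the $group/$push.
FILTER = {
    "cell_options.environment": {
        "$elemMatch": {"value": {"$in": ["Water", "Gym"]}}
    }
}
```

$elemMatch asks for at least one array element satisfying the inner condition; $in supplies the alternatives.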
[22:07:13] <poz2k4444> Hi guys, can somebody can help me with a problem I have with mongo-connector and elastic search?
[22:32:12] <hyperboreean> hey guys, any patterns on how to iterate really big collections? I have one which has almost 900M documents and it is close to impossible to go over it, even without any filtering
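One common pattern for this is _id-range batching (a pymongo sketch; the batch size is arbitrary). Each batch resumes strictly after the last _id seen, so no single cursor has to stay open across all 900M documents, and there is no skip(), which rescans from the start on every call:

```python
def next_batch_query(last_id):
    # Resume strictly after the last _id already processed.
    return {} if last_id is None else {"_id": {"$gt": last_id}}

def iterate_in_batches(collection, batch=10000):
    # Walks the whole collection in _id order, one bounded query at a time.
    last_id = None
    while True:
        docs = list(collection.find(next_batch_query(last_id))
                    .sort("_id", 1).limit(batch))
        if not docs:
            return
        for doc in docs:
            yield doc
        last_id = docs[-1]["_id"]
```

This leans on the always-present _id index; any other indexed, unique field would work the same way.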
[22:46:51] <gaboesquivel> thanks in advance if anyone can help me out with that query, http://stackoverflow.com/questions/37667901/how-to-search-for-documents-containing-objects-with-specific-values-in-a-nested